<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-13033340</id><updated>2012-01-27T08:12:58.940-05:00</updated><category term='Python'/><category term='Interworking'/><category term='Document'/><category term='Bamboo'/><category term='Unstructured'/><category term='H.323'/><category term='softcard'/><category term='Comparison'/><category term='ActionScript'/><category term='rtc-web'/><category term='Tcl'/><category term='SIP'/><category term='Protocols'/><category term='Firewall'/><category term='restlite'/><category term='policy-file'/><category term='Skype'/><category term='Programming'/><category term='programmable'/><category term='Google Video'/><category term='Gateway'/><category term='C++'/><category term='Open source'/><category term='Flash Player'/><category term='gevent'/><category term='SIP-XMPP'/><category term='multitask'/><category term='reliability'/><category term='IAX'/><category term='Software'/><category term='video'/><category term='Flex'/><category term='Kazaa'/><category term='RTMFP'/><category term='performance'/><category term='Communication'/><category term='P2P-SIP'/><category term='videocity'/><category term='database'/><category term='Channel API'/><category term='RTMP'/><category term='idea'/><category term='RFC'/><category term='siprtmp'/><category term='Internet'/><category term='Structured'/><category term='scalability'/><category term='rtmplite'/><category term='authentication'/><category term='REST'/><category term='Chord'/><category term='security'/><category term='XMPP'/><category term='Problems'/><category term='crossdomain'/><category term='Inspiration'/><category term='API'/><category 
term='multimedia'/><category term='Systems'/><category term='Google App Engine'/><category term='NAT'/><category term='Conferencing'/><category term='Business'/><category term='webrtc'/><category term='RESTful'/><category term='DHT'/><category term='Evolution'/><category term='Specification'/><category term='H.264'/><category term='memcached'/><category term='server'/><category term='ICE'/><category term='39 Peers'/><category term='event based'/><category term='RTP'/><category term='P2P'/><category term='AMF'/><category term='vvowproject'/><category term='Lessons'/><title type='text'>P2P-SIP</title><subtitle type='html'>Peer-to-peer Internet telephony using SIP</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>61</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-13033340.post-120661825196447087</id><published>2012-01-22T21:28:00.002-05:00</published><updated>2012-01-23T18:44:07.239-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='video'/><category scheme='http://www.blogger.com/atom/ns#' term='RTMP'/><category scheme='http://www.blogger.com/atom/ns#' term='RTP'/><category 
scheme='http://www.blogger.com/atom/ns#' term='H.264'/><category scheme='http://www.blogger.com/atom/ns#' term='siprtmp'/><title type='text'>Translating H.264 between Flash Player and SIP/RTP</title><content type='html'>Our SIP-RTMP gateway as part of the &lt;a href="http://code.google.com/p/rtmplite"&gt;rtmplite&lt;/a&gt; project includes the translation of packetization between Flash Player's RTMP and SIP/RTP for H.264. There are &lt;a href="http://p2p-sip.blogspot.com/2011/12/three-problems-in-interoperating-with.html"&gt;some hurdles&lt;/a&gt;, but it is doable! In this article I present what it takes to achieve such interoperability. If you are interested in looking at the implementation, please see the &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;_rtmp2rtpH264&lt;/span&gt; and &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;_rtp2rtmpH264&lt;/span&gt; functions in the &lt;a href="http://code.google.com/p/rtmplite/source/browse/trunk/siprtmp.py"&gt;siprtmp.py&lt;/a&gt; module.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Before jumping into the details, let us briefly review the background of H.264 packetization. The encoder encodes a sequence of frames or pictures to generate the encoded stream, which is consumed by the decoder to re-create the video. The encoder generates what is called a NALU, or &lt;i&gt;network abstraction layer unit&lt;/i&gt;. The decoder works on a single NALU and needs a sequence of NALUs to decode. Each frame can have one or more slices. Each slice can be encoded in one or more NALUs. There are certain pieces of information that remain the same for all or many frames. For example, the sequence parameter set (SPS) and picture parameter set (PPS) are like configuration elements that need to be sent once or only occasionally instead of with every frame or NALU. 
The configuration parameters apply to the encoder, whereas the decoder should be able to decode any configuration.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;RTMP Payload&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Flash Player 11+ is capable of capturing from the camera and encoding in H.264 to send to an RTMP stream. Each RTMP message contains a header and data (or payload), where the header contains crucial information such as the timestamp and stream identifier, and the payload contains the encoded video NALUs or the actual configuration data. The format of the payload is the same as that of the F4V/FLV tag for H.264 video in an FLV file. Each RTMP message contains one frame but may contain more than one NALU. The first byte contains the encoding type, and for H.264 is either &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;0x17&lt;/span&gt; (for intra-frame) or &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;0x27&lt;/span&gt; (for non-intra frame). The second byte contains the packet type and is either &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;0x00&lt;/span&gt; (configuration data) or &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;0x01&lt;/span&gt; (picture data). The configuration data contains both SPS and PPS as described here. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;pre&gt;  rtmp-payload := enc-type[1B] | type[1B] | remaining&lt;br /&gt;  enc-type := is-intra[4b] | codec-type[4b]&lt;br /&gt;  is-intra := 1 if intra and 2 if non-intra&lt;br /&gt;  codec-type := 7 for H.264/AVC&lt;br /&gt;&lt;/pre&gt;&lt;div&gt;If the type is configuration data then, after the three bytes that the F4V/FLV format reserves for the delay value (they are zero for configuration data), the next four bytes are the configuration version (&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;0x01&lt;/span&gt;), the profile index, the profile compatibility and the level index. 
This is followed by one byte whose least-significant two bits determine the number of bytes used for the length of the NALU in subsequent picture data messages. For example, if the bits are &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;11b&lt;/span&gt; then it indicates 3+1=4 bytes of NALU length, and if the bits are &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;01b&lt;/span&gt; then it indicates 1+1=2 bytes of NALU length. Let's call this the &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;length-size&lt;/span&gt;; its possible values are 1, 2 or 4. This is followed by a byte whose least-significant 5 bits give the number of subsequent SPS blocks. Each SPS block is prefixed by a 16-bit length followed by the bit-wise encoding of the SPS as per the H.264 specification. This is followed by a byte containing the number of subsequent PPS blocks. Each PPS block is prefixed by a 16-bit length followed by the bit-wise encoding of the PPS as per the H.264 specification. Typically only one SPS block and one PPS block are present.&lt;/div&gt;&lt;pre&gt;&lt;br /&gt;  remaining for config := zero[3B] | version[1B] | profile-idc[1B]&lt;br /&gt;      | profile-compat[1B] &lt;br /&gt;      | level-idc[1B]&lt;br /&gt;      | length-flag[1B]&lt;br /&gt;      | sps-count[1B] | sps0 ...&lt;br /&gt;      | pps-count[1B] | pps0 ...&lt;br /&gt;  length-flag := 0[6b] | value[2b] where value + 1 is length-size&lt;br /&gt;  sps-count := 0[3b] | count[5b] where count is number of sps&lt;br /&gt;  pps-count := number of pps elements&lt;br /&gt;  sps(n) := length[2B] | sps&lt;br /&gt;  pps(n) := length[2B] | pps&lt;br /&gt;&lt;/pre&gt;&lt;div&gt;If the type is picture data then the next three bytes contain a 24-bit number for the decoder delay of the frame, which is applicable only to B-frames. The default baseline profile does not include B-frames. Thus the first five bytes of the picture data payload are like header data. 
This is followed by one or more NALU blocks. Each NALU block is prefixed by the length of the encoded NALU that follows. The number of bytes used to encode this length is determined by the &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;length-size&lt;/span&gt; mentioned earlier. Then the NALU is encoded as per the H.264 specification.&lt;/div&gt;&lt;pre&gt;&lt;br /&gt;  remaining-picture := delay[3B] | nalu0 | nalu1 ...&lt;br /&gt;  nalu(n) := length | nalu&lt;br /&gt;  length := number in length-size bytes&lt;br /&gt;  nalu := NAL unit as per H.264&lt;br /&gt;&lt;/pre&gt;&lt;div&gt;Each NALU starts with one byte of flags. The flags contain 1 most-significant forbidden bit, then 2 bits of &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;nri&lt;/span&gt; (NAL reference index) and finally 5 least-significant bits of &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;nal-type&lt;/span&gt;. There are several nal-types such as &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;0x01&lt;/span&gt; for non-intra regular pictures, &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;0x05&lt;/span&gt; for intra-pictures, etc. Please see the H.264 specification for the complete list. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The data captured from the camera and encoded in Flash Player contains three NALUs in each RTMP message -- the access unit delimiter (nal-type &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;0x09&lt;/span&gt;), the timing-information (nal-type &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;0x06&lt;/span&gt;) and the picture slice (nal-type &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;0x01&lt;/span&gt; or &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;0x05&lt;/span&gt;). 
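&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As a rough illustration (this is not the code in siprtmp.py), the payload layouts above can be parsed with a few lines of Python 3. The function names are made up for this sketch. It also assumes, following the F4V/FLV specification, that the three delay bytes are present (as zeros) before the configuration version in a configuration payload.&lt;/div&gt;&lt;pre&gt;
```python
def read_blocks(payload, pos, count):
    """Read `count` blocks, each prefixed by a 16-bit big-endian length."""
    blocks = []
    for _ in range(count):
        size = int.from_bytes(payload[pos:pos + 2], "big")
        blocks.append(payload[pos + 2:pos + 2 + size])
        pos += 2 + size
    return blocks, pos


def parse_config(payload):
    """Return (length_size, sps_list, pps_list) from a configuration payload."""
    if 11 > len(payload) or payload[0] % 16 != 7 or payload[1] != 0x00:
        raise ValueError("not an H.264 configuration payload")
    # Skip enc-type, type, three zero bytes (an assumption taken from the
    # FLV format) and the version, profile, compatibility and level bytes.
    pos = 9
    length_size = payload[pos] % 4 + 1     # two least-significant bits, plus one
    sps_count = payload[pos + 1] % 32      # five least-significant bits
    sps, pos = read_blocks(payload, pos + 2, sps_count)
    pps, pos = read_blocks(payload, pos + 1, payload[pos])
    return length_size, sps, pps


def split_picture_payload(payload, length_size=4):
    """Return (is_intra, delay, nalus) from a picture-data payload."""
    if 5 > len(payload) or payload[0] % 16 != 7 or payload[1] != 0x01:
        raise ValueError("not an H.264 picture-data payload")
    is_intra = payload[0] // 16 == 1       # 0x17 is intra, 0x27 is non-intra
    delay = int.from_bytes(payload[2:5], "big")
    nalus, pos = [], 5
    while len(payload) - pos >= length_size:
        size = int.from_bytes(payload[pos:pos + length_size], "big")
        pos += length_size
        nalu = payload[pos:pos + size]
        if len(nalu) != size:
            raise ValueError("truncated NALU")
        nalus.append(nalu)
        pos += size
    return is_intra, delay, nalus


def nal_flags(nalu):
    """Return (forbidden, nri, nal_type) from the first byte of a NALU."""
    first = nalu[0]
    return first // 128, (first // 32) % 4, first % 32
```
&lt;/pre&gt;&lt;div&gt;The length_size returned by parse_config is the value to pass to split_picture_payload for the picture data messages that follow.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;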
The Flash Player is capable of decoding other nal-types as well, and does not require the access unit delimiter or timing-information NALUs for decoding. I haven't seen any support for aggregated or fragmented NALUs in the Flash Player.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;RTP Payload&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The RTP payload format for H.264 is specified in &lt;a href="http://tools.ietf.org/html/rfc6184"&gt;RFC 6184&lt;/a&gt; and is typically supported in SIP-based video phones. The RTP header contains the crucial information such as the payload type, the timing data, and the sequence number, whereas the actual configuration and picture NALUs are sent in the payload as specified by this RFC. The first byte is the type, containing one forbidden bit, two bits of &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;nri&lt;/span&gt; and 5 bits of &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;nal-type&lt;/span&gt;.&lt;/div&gt;&lt;pre&gt;&lt;br /&gt;  nalu := nal-flags[1B] | encoded-data&lt;br /&gt;  nal-flags := forbidden[1b] | nri[2b] | nal-type [5b]&lt;br /&gt;&lt;/pre&gt;&lt;div&gt;In addition to the base nal-types of H.264, the RFC defines new nal-types for fragmentation and aggregation. Traditionally, the Internet, plagued by middle-boxes, NATs and firewalls, has imposed a limit on the size of the UDP packet that can be pragmatically used, and the typical MTU is around 1400-1500 bytes. The H.264 encoder is capable of generating much larger encoded frames, which in many cases cannot be sent successfully as one frame per RTP packet over UDP. On the other hand, some encoded frames may be much smaller than the MTU, thus incurring additional overhead for RTP headers. 
These small frames can be aggregated for efficiency.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Many SIP video phones configure their H.264 encoders to use multiple slice NALUs in a single frame, unlike Flash Player which generates one picture NALU per frame. Thus traditional SIP video phones can produce small encoded payloads, each of which can be sent in a single RTP/UDP packet without the fragmentation of RFC 6184.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When a large encoded frame is split into smaller fragments, the nal-type=28 is used in the first byte of each fragment, followed by the second byte containing the actual nal-type of the frame as well as the start and end markers. This is followed by the actual encoded data. The RTP headers of all these fragments contain the same timestamp value. The last fragment of the frame has the marker set to true, whereas all the previous ones have it set to false. When multiple smaller frames are aggregated, the nal-type of 24 is used in the first byte of the aggregate payload, followed by one or more NALUs. Each of these is prefixed by the 16-bit length of the encoded NALU. 
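&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A minimal Python 3 sketch of this fragmentation, based on my reading of the FU-A format in RFC 6184 and not copied from siprtmp.py, is shown here. The names fragment_nalu, reassemble_nalu and max_payload are hypothetical. The sketch assumes that the NALU being fragmented is the last one of its frame, so that the marker can be set on the final fragment, and it keeps the forbidden and nri bits of the original NALU in the first byte of every fragment.&lt;/div&gt;&lt;pre&gt;
```python
FU_A = 28  # nal-type used for fragmented payloads


def fragment_nalu(nalu, max_payload):
    """Return a list of (rtp_payload, marker) pairs for one picture NALU."""
    if 3 > max_payload:
        raise ValueError("max_payload must leave room for two header bytes")
    if max_payload >= len(nalu):
        return [(nalu, True)]                # small enough: send it as is
    first = nalu[0]
    indicator = first - first % 32 + FU_A    # keep forbidden and nri bits
    nal_type = first % 32                    # original nal-type
    body = nalu[1:]                          # original first byte is not repeated
    step = max_payload - 2                   # two bytes are used by the headers
    chunks = [body[i:i + step] for i in range(0, len(body), step)]
    result = []
    for index, chunk in enumerate(chunks):
        start = 128 if index == 0 else 0
        end = 64 if index == len(chunks) - 1 else 0
        header = bytes([indicator, start + end + nal_type])
        result.append((header + chunk, end != 0))
    return result


def reassemble_nalu(payloads):
    """Rebuild the original NALU from the ordered fragment payloads."""
    first = payloads[0]
    flags = first[0] - first[0] % 32 + first[1] % 32
    return bytes([flags]) + b"".join(p[2:] for p in payloads)
```
&lt;/pre&gt;&lt;div&gt;All the returned payloads are meant to be sent with the same RTP timestamp, and only the last one carries the marker.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;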
There are non-trivial rules on how the &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;nri&lt;/span&gt; is obtained and we refer you to the RFC for the details.&lt;/div&gt;&lt;pre&gt;&lt;br /&gt;To fragment:&lt;br /&gt;  let encoded-data = fragment0 | fragment1 | fragment2...&lt;br /&gt;  encoded-data of fragment(n) := orig-nal-flags[1B] | fragment(n)&lt;br /&gt;  orig-nal-flags := start[1b] | end[1b] | ignore[1b] &lt;br /&gt;     | orig-nal-type[5b]&lt;br /&gt;  start := 1 if first fragment else 0&lt;br /&gt;  end := 1 if last fragment else 0&lt;br /&gt;&lt;br /&gt;To aggregate:&lt;br /&gt;  encoded-data of aggregate := nalu0 | nalu1 | nalu2 ...&lt;br /&gt;  nalu(n) := length[2B] | orig-nalu(n)&lt;br /&gt;&lt;/pre&gt;&lt;div&gt;In addition to sending the SPS and PPS packets in RTP, the video phones also negotiate the configuration data via an external protocol such as SIP/SDP. Since Flash Player does not do that, we will not discuss it further.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Translating&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now that we understand the packetization of H.264 for Flash Player as well as SIP/RTP, let us go over the details of the translation process.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The configuration data is sent periodically by Flash Player before every intra-picture frame. However, SIP phones may not send the configuration data periodically. It is desirable to cache the configuration data received from both sides, and re-use it when the other side connects. The first packet sent must contain the configuration data. It is also desirable to periodically send the configuration data to both Flash Player and SIP sides from the translator, irrespective of whether the configuration data is received periodically. 
In our translator we send the configuration data before every intra frame.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In the Flash Player to SIP/RTP direction, when the configuration data is received on RTMP, it is sent in two RTP packets, one for SPS and one for PPS. Both use the same timestamp and set the marker to true. When picture data is received on RTMP and needs to be sent to the RTP side, it is dropped until the configuration data has been sent to the RTP side. If the picture data is not dropped, all the NALUs are extracted. The last of the three NALUs per RTMP message is the actual picture NALU, which is sent to the RTP side as follows. Only the &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;nal-type&lt;/span&gt; values of 1 and 5 are used, whereas others are ignored. If the NAL size is less than 1500 bytes, it is used as is in the RTP payload with the marker set to true. If the NAL size is more, it is split into smaller fragments, each of size at most 1500 bytes. Multiple fragmented RTP packets are generated as per the RFC. All but the last fragment have the marker set to false. The RTP marker of true indicates the end of the frame. All the fragments use the same timestamp value.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In the SIP/RTP to Flash Player direction, the configuration data is received in multiple RTP packets and is cached by the translator. When both SPS and PPS payloads have been received from the RTP side, we are ready to start streaming to the Flash Player side. Any incoming RTP packet is put in a queue. When the last packet in the queue (that was most recently received) has the marker set to true, the queue is examined and RTMP messages are created to be sent to the Flash Player side. Since Flash Player handles complete frames in each RTMP message, we need to wait until the marker is set to true so that we only send complete frames to Flash Player. 
If the RTMP stream is ready but we have not received the configuration data from RTP, or we have not yet sent (and are not currently sending) the first intra frame to RTMP, then received packets are dropped. If no intra frames are received for 5 seconds, then we send a fast-intra-update (FIR) request to the SIP/RTP side, so that it triggers the SIP phone to send an intra frame. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Once we decide that we can send packets to RTMP from the received RTP queue, we divide the queue into groups of packets with the same timestamp and the same nal-type values while preserving the order of the packets. If the nal-type is 5, indicating that an intra-frame is being sent to RTMP, then we send the configuration data too before the actual picture data.  The configuration payload format is explained earlier and contains both PPS and SPS along with other elements. Each &lt;i&gt;group&lt;/i&gt; of packets of the same timestamp and same nal-type is sent as a single RTMP message in the same order containing one or more NALUs. If the nal-type is 1 or 5, the NALU from the RTP payload is used as is in the RTMP payload with five bytes of header as explained earlier. If the nal-type is 28, indicating fragmented packets, then all the fragmented payloads are combined into a single NALU. If the nal-type is 24, indicating an aggregated packet, then it is split into individual NALUs. Then the sequence of NALUs generated from this group of packets of the same timestamp and nal-type is combined into a single RTMP payload to be sent to the Flash Player.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Gotchas&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As mentioned in my previous article, there are a few gotchas. You must use the new-style RTMP handshake, otherwise the Flash Player will not decode/display the received H.264 stream. 
You must use Flash Player 11.2 (beta) or later when using "live" mode, otherwise the Flash Player does not accept multiple slice NALUs of a single frame. If audio and video are enabled, then the timestamp of video must be synchronized with the timestamp of audio sent to RTMP. Note that RTP picks a random initial timestamp for each media stream, so the audio and video RTP timestamp values are not easily correlated unless RTCP or an external mechanism is used. You need to correlate the RTP timestamps of audio and video to a single timestamp clock of RTMP.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Conclusion&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It is possible to do re-packetization of H.264 between Flash Player's RTMP and standard SIP/RTP without having to do actual video transcoding. This article explains the tricks and gotchas of doing so! &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The implementation works between Flash Player 11.2 and a few SIP video phones such as Ekiga and Bria 3.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;References&lt;/b&gt;&lt;/div&gt;&lt;div&gt;[1] &lt;a href="http://code.google.com/p/rtmplite/source/browse/trunk/siprtmp.py"&gt;Source code of SIP-RTMP translation&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;[2] &lt;a href="http://p2p-sip.blogspot.com/2011/12/three-problems-in-interoperating-with.html"&gt;Three problems in interoperating with H.264 of Flash Player&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;[3] &lt;a href="https://bugbase.adobe.com/index.cfm?event=bug&amp;amp;id=2991202"&gt;Flash Player bug 2991202 fixed in version 11.2 (beta)&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;[4] &lt;a href="http://tools.ietf.org/html/rfc6184"&gt;RFC 6184: RTP payload format for H.264 video&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;[5] &lt;a href="http://download.macromedia.com/f4v/video_file_format_spec_v10_1.pdf"&gt;F4V/FLV video file format specification&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;[6] ITU-T 
recommendation H.264, "advanced video coding for generic audiovisual services", March 2010.&lt;/div&gt;&lt;div&gt;[7] ISO/IEC International Standard 14496-10:2008.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-120661825196447087?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/120661825196447087/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=120661825196447087' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/120661825196447087'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/120661825196447087'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2012/01/translating-h264-between-flash-player.html' title='Translating H.264 between Flash Player and SIP/RTP'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-72211057416792908</id><published>2011-12-14T02:39:00.000-05:00</published><updated>2011-12-13T22:18:14.179-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='RTMFP'/><category scheme='http://www.blogger.com/atom/ns#' term='Protocols'/><title type='text'>Understanding RTMFP Handshake</title><content type='html'>(Disclaimer: the protocol description here is not an 
official specification of RTMFP but just my understanding of the protocol based on the &lt;a href="https://github.com/OpenRTMFP/Cumulus"&gt;OpenRTMFP's Cumulus&lt;/a&gt; project as well as the &lt;a href="http://www.ietf.org/proceedings/10mar/slides/tsvarea-1.pdf"&gt;IETF presentation slides&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Introduction&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;What is RTMFP?&lt;/span&gt; RTMFP (Real-time media flow protocol) allows a UDP-based low-latency end-to-end media path between two Flash Player instances. Compared to the earlier RTMP-based media path which runs over TCP, this new protocol enables actual real-time communication on the web. Although the end-to-end media path is not always possible when certain types of NATs and firewalls are present, it is possible to do end-to-end media across most residential-type NATs. The end-to-end media path between two Flash Player instances reduces latency and improves the scalability of the service (or server infrastructure), since most heavy media traffic can be sent without going through the hosted server. The UDP transport reduces latency compared to TCP transport even if the media-path is client-server.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;Why is understanding RTMFP important?&lt;/span&gt; Unlike the earlier RTMP, the new protocol RTMFP is still closed with no open specification available. There have been some attempts at reverse engineering the protocol for interoperability and some official slides explaining the core logic. Understanding the wire-protocol is not important if you are building Flash-based applications that only talk to each other. However for applications such as a Flash-to-SIP gateway or a Flash-to-RTSP translator, where you may need to interoperate between RTMFP and SIP/RTP, it is important to understand the wire-protocol in detail. 
For a Flash-to-SIP gateway incorporating RTMFP from the Flash side in addition to the existing RTMP will enable low-latency UDP media path between the web user and the translator service on the Internet.&lt;br /&gt;&lt;br /&gt;&lt;div&gt;The following description is reproduced from a contribution (see &lt;a href="http://code.google.com/p/rtmplite/source/browse/trunk/rtmfp.py"&gt;rtmfp.py&lt;/a&gt;) to my RTMP server project.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Session&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;An RTMFP session is an end-to-end bi-directional pipe between two UDP transport addresses. A transport address contains an IP address and port number, e.g., "192.1.2.3:1935". A session can have one or more flows where a flow is a logical path from one entity to another via zero or more intermediate entities. UDP packets containing encrypted RTMFP data are exchanged in a session. A packet contains one or more messages. A packet is always encrypted using AES with 128-bit keys.&lt;br /&gt;&lt;br /&gt;In the protocol description below, all numbers are in network byte order (big-endian). The | operator indicates concatenation of data. The numbers are assumed to be unsigned unless mentioned explicitly.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Scrambled Session ID&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The packet format is as follows. Each packet has the first 32 bits of scrambled session-id followed by encrypted part. The scrambled (instead of raw) session-id makes it difficult if not impossible to mangle packets by middle boxes such as NATs and layer-4 packet inspectors. The bit-wise XOR operator is used to scramble the first 32-bit number with subsequent two 32-bit numbers. 
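&lt;br /&gt;&lt;br /&gt;As a minimal Python 3 sketch of this scrambling (the function names are invented here, and the encrypted part is assumed to be at least 8 bytes long), the session-id can be combined with, and recovered from, the first two 32-bit numbers of the encrypted part like this:&lt;br /&gt;&lt;pre&gt;
```python
import struct


def scramble(session_id, encrypted_part):
    """Prefix the encrypted part with the scrambled 32-bit session-id."""
    b, c = struct.unpack(">II", encrypted_part[:8])
    return struct.pack(">I", session_id ^ b ^ c) + encrypted_part


def unscramble(packet):
    """Return (session_id, encrypted_part) from a received packet."""
    scrambled, b, c = struct.unpack(">III", packet[:12])
    return scrambled ^ b ^ c, packet[4:]
```
&lt;/pre&gt;Because XOR is its own inverse, applying the same two numbers a second time gives back the original session-id.&lt;br /&gt;&lt;br /&gt;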
The XOR operator makes it possible to easily unscramble.&lt;br /&gt;&lt;pre&gt;packet := scrambled-session-id | encrypted-part&lt;/pre&gt;&lt;br /&gt;To scramble a session-id,&lt;br /&gt;&lt;pre&gt;scrambled-session-id = a^b^c&lt;/pre&gt;&lt;br /&gt;where ^ is the bit-wise XOR operator, &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;a&lt;/span&gt; is the session-id, and &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;b&lt;/span&gt; and &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;c&lt;/span&gt; are two 32-bit numbers from the first 8 bytes of the encrypted-part.&lt;br /&gt;&lt;br /&gt;To unscramble,&lt;br /&gt;&lt;pre&gt;session-id = x^y^z&lt;/pre&gt;&lt;br /&gt;where &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;x&lt;/span&gt; is the scrambled-session-id, and &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;y&lt;/span&gt; and &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;z&lt;/span&gt; are the same two 32-bit numbers from the first 8 bytes of the encrypted-part.&lt;br /&gt;&lt;br /&gt;The session-id determines which session keys are used for encryption and decryption of the encrypted part. There is one exception: the fourth message in the handshake contains a non-zero session-id, but the handshake (symmetric) session keys are used for its encryption/decryption. For the handshake messages, symmetric AES (advanced encryption standard) with the 128-bit (16 bytes) key of &lt;tt&gt;"Adobe Systems 02"&lt;/tt&gt; (without quotes) is used. For subsequent in-session messages the established asymmetric session keys are used as described later.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Encryption&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Assuming that the AES keys are known, the encryption and decryption of the encrypted-part is done as follows. 
For decryption, an initialization vector of all zeros (&lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;0&lt;/span&gt;'s) is used for every decryption operation. For encryption, the raw-part is assumed to be padded as described later, and an initialization vector of all zeros (&lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;0&lt;/span&gt;'s) is used for every encryption operation. The decryption operation does not add additional padding, and the byte-size of the encrypted-part and the raw-part must be same.&lt;br /&gt;&lt;br /&gt;The decrypted raw-part format is as follows. It starts with a 16-bit checksum, followed by variable bytes of network-layer data, followed by padding. The network-layer data ignores the padding for convenience.&lt;br /&gt;&lt;pre&gt;raw-part := checksum | network-layer-data | padding&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The padding is a sequence of zero or more bytes where each byte is &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\xff&lt;/span&gt;. Since it uses 128-bit (16 bytes) key, padding ensures that the size in bytes of the decrypted part is a multiple of 16. Thus, the size of padding is always less than 16 bytes and is calculated as follows:&lt;br /&gt;&lt;pre&gt;len(padding) = 16*N - len(network-layer-data) - 1&lt;/pre&gt;&lt;br /&gt;where N is any positive number to make &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;0 &amp;lt;= padding-size &amp;lt; 16&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;For example, if network-layer-data is 84 bytes, then padding is &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;16*6-84-1=11&lt;/span&gt; bytes. 
Adding a padding of 11 bytes makes the decrypted raw-part of size 96 which is a multiple of 16 (bytes) hence works with AES with 128-bit key.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Checksum&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The checksum is calculated over the concatenation of network-layer-data and padding. Thus for the encoding direction you should apply the padding, then calculate the checksum, and then AES encrypt; for the decoding direction you should AES decrypt, verify the checksum and then remove the (optional) padding if needed. Usually padding removal is not needed because network-layer data decoders will ignore the remaining data anyway.&lt;br /&gt;&lt;br /&gt;The 16-bit checksum number is calculated as follows. The concatenation of network-layer-data and padding is treated as a sequence of  16-bit numbers. If the size in bytes is not an even number, i.e., not divisible by 2, then the last 16-bit number used in the checksum calculation has that last byte in the least-significant position (weird!). All the 16-bit numbers are added into a 32-bit number. The upper 16 bits and the lower 16 bits of this 32-bit sum are then added, and the upper 16 bits of the resulting  number are added to it once more. Only the least-significant 16-bit part of the resulting sum is used as the checksum.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Network Layer Data&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The network-layer data contains flags, optional timestamp, optional timestamp echo and one or more chunks.&lt;br /&gt;&lt;pre&gt;network-layer-data = flags | timestamp | timestamp-echo | chunks ...&lt;/pre&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;The flags value is a single byte containing this information: time-critical forward notification, time-critical reverse notification,  whether the timestamp is present, whether the timestamp echo is present, and the initiator/responder marker. 
The initiator/responder marker is useful if the symmetric (handshake) session keys are used for AES, so that it protects against packet loopback to sender.&lt;br /&gt;&lt;br /&gt;The bit format of the flags is not clear, but the following applies. For the handshake messages, the flags byte is &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\x0b&lt;/span&gt;. When the flags' least-significant 4-bits are &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;1101b&lt;/span&gt; then the timestamp-echo is present. The timestamp seems to be always present. For in-session messages sent by Flash Player, the last 4-bits are either &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;1101b&lt;/span&gt; or &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;1001b&lt;/span&gt;.&lt;br /&gt;&lt;pre&gt;--------------------------------------------------------------------&lt;br /&gt;flags      meaning&lt;br /&gt;--------------------------------------------------------------------&lt;br /&gt;0000 1011  setup/handshake&lt;br /&gt;0100 1010  in-session no timestamp-echo (server to Flash Player)&lt;br /&gt;0100 1110  in-session with timestamp-echo (server to Flash Player)&lt;br /&gt;xxxx 1001  in-session no timestamp-echo (Flash Player to server)&lt;br /&gt;xxxx 1101  in-session with timestamp-echo (Flash Player to server)&lt;br /&gt;--------------------------------------------------------------------&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;TODO: looks like bit &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\x04&lt;/span&gt; indicates whether timestamp-echo is present. Probably &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\x08&lt;/span&gt; indicates whether timestamp is present, since that bit is set in every row above and the timestamp seems to be always present. 
The last two bits indicate the mode: &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;11b&lt;/span&gt; indicates handshake, &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;10b&lt;/span&gt; indicates server to client and &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;01b&lt;/span&gt; indicates client to server.&lt;br /&gt;&lt;br /&gt;The timestamp is a 16-bit number that represents the time with a 4-millisecond clock. The wall clock time can be used for generation of this timestamp value. For example if the current time in seconds is &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;tm = 1319571285.9947701&lt;/span&gt; then timestamp is calculated as follows: &lt;pre&gt;int(tm * 1000/4) &amp;amp; 0xffff = 63994&lt;/pre&gt;, i.e., assuming a 4-millisecond clock, calculate the clock units and use the least significant 16-bits.&lt;br /&gt;&lt;br /&gt;The timestamp-echo is just the timestamp value that was received in the incoming request and is being echoed back. The timestamp and its echo allow the system to calculate the round-trip-time (RTT) and keep it up-to-date.&lt;br /&gt;&lt;br /&gt;Each chunk starts with an 8-bit type, followed by the 16-bit size of payload, followed by the payload of size bytes. Note that &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\xff&lt;/span&gt; is reserved and not used for chunk-type. This is useful in detecting when the network-layer-data has finished and padding has started because padding uses &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\xff&lt;/span&gt;. 
Alternatively, &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\x00&lt;/span&gt; can also be used for padding as that is a reserved type too!&lt;br /&gt;&lt;pre&gt;chunk = type | size | payload&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Message Flow&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;There are three types of session messages: session setup, control and flows. The session setup is part of the four-way handshake whereas control and flows are in-session messages. The session setup messages are initiator hello, responder hello, initiator initial keying, responder initial keying, responder hello cookie change and responder redirect. The control messages are ping, ping reply, re-keying initiate, re-keying response, close, close acknowledge and forwarded initiator hello. The flow messages are user data, next user data, buffer probe, user data ack (bitmap), user data ack (ranges) and flow exception report.&lt;br /&gt;&lt;br /&gt;A new session starts with a handshake of the session setup. 
Under normal client-server case, the message flow is as follows:&lt;br /&gt;&lt;pre&gt; initiator (client)                target (server)&lt;br /&gt;  |-------initiator hello----------&amp;gt;|&lt;br /&gt;  |&amp;lt;------responder hello-----------|&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Under peer-to-peer session setup case for NAT traversal, the server acts as a forwarder and forwards the hello to another connected  client as follows:&lt;br /&gt;&lt;pre&gt; initiator (client)                forwarder (server)                     target (client)&lt;br /&gt;|-------initiator hello----------&amp;gt;|                                       |&lt;br /&gt;|                                 |---------- forwarded initiator hello--&amp;gt;|&lt;br /&gt;|                                 |&amp;lt;--------- ack -----------------------&amp;gt;|&lt;br /&gt;|&amp;lt;------------responder hello---------------------------------------------|&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Alternatively, the server could redirect to another target by supplying an alternative list of target addresses as follows:&lt;br /&gt;&lt;pre&gt; initiator (client)                redirector (server)                     target (client)&lt;br /&gt;|-------initiator hello----------&amp;gt;|                                 &lt;br /&gt;|&amp;lt;------responder redirect--------|&lt;br /&gt;|-------------initiator hello--------------------------------------------&amp;gt;|&lt;br /&gt;|&amp;lt;------------responder hello---------------------------------------------|&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Note that the initiator, target, forwarder and redirector are just roles for session setup whereas client and server are specific implementations such as Flash Player and Flash Media Server, respectively. Even a server may initiate an initiator hello to a client in which case the server becomes the initiator and client becomes the target for that session. 
This mechanism is used for the man-in-middle mode in the Cumulus project.&lt;br /&gt;&lt;br /&gt;The initiator hello may be forwarded to another target but the responder hello is sent directly. After that the initiator initial keying and the responder initial keying are exchanged (between the initiator and the responding target directly) to establish the session keys for the session between the initiator and the target. The four-way handshake prevents denial-of-service (DoS) via SYN-flooding and port scanning.&lt;br /&gt;&lt;br /&gt;As mentioned before, the handshake messages for session-setup use the symmetric AES key &lt;tt&gt;"Adobe Systems 02"&lt;/tt&gt; (without the quotes), whereas in-session messages use the established asymmetric AES keys (a different key for each direction). Intuitively, the session setup is sent over a pre-established AES cryptosystem, and it creates a new asymmetric AES cryptosystem for the new session. Note that a session-id is established for the new session during the initial keying process, hence the first three messages (initiator-hello, responder-hello and initiator-initial-keying) use a session-id of &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;0&lt;/span&gt;, and the last responder-initial-keying uses the session-id sent by the initiator in the previous message. 
This is further explained later.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Message Types&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The 8-bit type values and their meaning are shown below.&lt;pre&gt;---------------------------------&lt;br /&gt;type  meaning&lt;br /&gt;---------------------------------&lt;br /&gt;\x30  initiator hello&lt;br /&gt;\x70  responder hello&lt;br /&gt;\x38  initiator initial keying&lt;br /&gt;\x78  responder initial keying&lt;br /&gt;\x0f  forwarded initiator hello&lt;br /&gt;\x71  forwarded hello response&lt;br /&gt;&lt;br /&gt;\x10  normal user data&lt;br /&gt;\x11  next user data&lt;br /&gt;\x0c  session failed on client side&lt;br /&gt;\x4c  session died&lt;br /&gt;\x01  causes response with \x41, reset keep alive&lt;br /&gt;\x41  reset times keep alive&lt;br /&gt;\x5e  negative ack&lt;br /&gt;\x51  some ack&lt;br /&gt;---------------------------------&lt;br /&gt;&lt;/pre&gt;TODO: most likely the bit &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\x01&lt;/span&gt; indicates whether the transport-address is present or not.&lt;br /&gt;&lt;br /&gt;The contents of the various message payloads are described below.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Variable Length Data&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The protocol uses variable length data and variable length number. Any variable length data is usually prefixed by its size-in-bytes encoded as a variable length number. A variable length number is an unsigned 28-bit number that is encoded in 1 to 4 bytes depending on its value. 
To get the bit-representation, first assume the number to be composed of four 7-bit numbers as follows:&lt;pre&gt;number = 0000dddd dddccccc ccbbbbbb baaaaaaa (in binary)&lt;br /&gt;where A=aaaaaaa, B=bbbbbbb, C=ccccccc, D=ddddddd are the four 7-bit numbers&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The variable length number representation is as follows. The most significant bit of every byte except the last is set to 1, so that a decoder knows that more bytes follow:&lt;pre&gt;0aaaaaaa (1 byte)  if B = C = D = 0&lt;br /&gt;1bbbbbbb 0aaaaaaa (2 bytes) if C = D = 0 and B != 0&lt;br /&gt;1ccccccc 1bbbbbbb 0aaaaaaa (3 bytes) if D = 0 and C != 0&lt;br /&gt;1ddddddd 1ccccccc 1bbbbbbb 0aaaaaaa (4 bytes) if D != 0&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Thus a 28-bit number is represented as 1 to 4 bytes of variable length number. This mechanism saves bandwidth since most numbers are small and can fit in 1 or 2 bytes, but still allows values up to 2^28-1 when needed.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Handshake&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The initiator-hello payload contains an endpoint discriminator (EPD) and a tag. The payload format is as follows:&lt;br /&gt;&lt;pre&gt;initiator-hello payload = first | epd | tag&lt;/pre&gt;&lt;br /&gt;The meaning of the first field (8-bit) is unknown. The next field, epd, is a variable length data that contains an epd-type (8-bit) and epd-value (remaining). Note that any variable length data is prefixed by its length as a variable length number. The epd is typically shorter than 128 bytes, so a single-byte length is enough. The tag is a fixed 128-bit (16 bytes) randomly generated data. The fixed sized tag does not encode its length.&lt;br /&gt;&lt;pre&gt;epd = epd-type | epd-value&lt;/pre&gt;&lt;br /&gt;The epd-type is &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\x0a&lt;/span&gt; for client-server and &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\x0f&lt;/span&gt; for peer-to-peer session. 
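&lt;br /&gt;&lt;br /&gt;As a rough illustration of the encoding described so far, the following Python sketch writes and reads a variable length number and uses it to assemble an initiator-hello payload. The function names are made up for this example. The sketch assumes that every byte except the last has its most significant bit set to 1, so that the reader can tell where the number ends. The value of the first byte is passed in by the caller because its meaning is unknown, and the URL is a placeholder.&lt;br /&gt;&lt;pre&gt;
```python
import os


def encode_vlu(value):
    """Encode an unsigned number below 2**28 into 1 to 4 bytes of 7 bits each."""
    if value not in range(2 ** 28):
        raise ValueError("value must fit in 28 bits")
    groups = [value % 128]                     # last byte: top bit clear
    value //= 128
    while value != 0:
        groups.append(128 + value % 128)       # earlier bytes: top bit set (assumption)
        value //= 128
    return bytes(reversed(groups))


def decode_vlu(data, offset=0):
    """Return (value, next_offset) for the variable length number at data[offset]."""
    value = 0
    for used in range(1, 5):
        byte = data[offset + used - 1]
        value = value * 128 + byte % 128
        if byte // 128 == 0:                   # top bit clear marks the last byte
            return value, offset + used
    raise ValueError("variable length number is longer than 4 bytes")


def initiator_hello_payload(first, epd_type, epd_value, tag):
    """Assemble first | epd | tag, with the epd prefixed by its length."""
    if len(tag) != 16:
        raise ValueError("tag must be exactly 16 bytes")
    epd = bytes([epd_type]) + epd_value
    return bytes([first]) + encode_vlu(len(epd)) + epd + tag


# Example: client-server hello with a hypothetical URL and a placeholder first byte.
tag = os.urandom(16)
payload = initiator_hello_payload(0x00, 0x0A, b"rtmfp://server.example/app", tag)
```
&lt;/pre&gt;The tag is written as is, without a length prefix, because its size is fixed.&lt;br /&gt;&lt;br /&gt;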
If epd-type is peer-to-peer, then the epd-value is the peer-id, whereas if epd-type is client-server the epd-value is the RTMFP URL that the client connects to. The initiator sets the epd-value such that the responder can tell whether the initiator-hello is for them but an eavesdropper cannot deduce the identity from that epd. This is done, for example, using a one-way hash function of the identity.&lt;br /&gt;&lt;br /&gt;The tag is chosen randomly by the initiator, so that it can match the response against the pending session setup. Once the setup is complete the tag can be forgotten.&lt;br /&gt;&lt;br /&gt;When the target receives the initiator-hello, it checks whether the epd is for this endpoint. If it is for "another" endpoint, the initiator-hello is silently discarded to avoid port scanning. If the target is an introducer (server) then it can respond with a responder message, or redirect/proxy the message with forwarded-initiator-hello to the actual target. In the general case, the target responds with responder-hello.&lt;br /&gt;&lt;br /&gt;The responder-hello payload contains the tag echo, a new cookie and the responder certificate. The payload format is as follows:&lt;br /&gt;&lt;pre&gt;responder-hello payload = tag-echo | cookie | responder-certificate&lt;/pre&gt;&lt;br /&gt;The tag echo is the same as the original tag from the initiator-hello but encoded as variable length data with variable length size. Since the tag is 16 bytes, the size can fit in 8 bits.&lt;br /&gt;&lt;br /&gt;The cookie is a randomly and statelessly generated variable length data that the responder can use to accept the next message only if this message was actually received by the initiator. This eliminates the "SYN flood" attacks, e.g., if a server had to store the initial state then an attacker could overload the state memory slots by flooding it with bogus initiator-hello messages and prevent further legitimate initiator-hello messages. The SYN flooding attack is common in TCP servers. 
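&lt;br /&gt;&lt;br /&gt;How Flash Player or Flash Media Server actually builds its cookie is not known here. Purely as an illustration of the stateless idea, the following Python sketch derives a 64-byte cookie from a secret that only the responder knows, so that the responder can later validate a cookie-echo without having stored anything per request. The layout (8-byte issue time, 24 random bytes, 32-byte HMAC), the function names, the secret and the 60-second lifetime are all assumptions made for this example.&lt;br /&gt;&lt;pre&gt;
```python
import hashlib
import hmac
import os
import struct
import time

SERVER_SECRET = os.urandom(32)   # hypothetical per-process secret, never sent on the wire


def make_cookie(address, now=None):
    """Build a 64-byte cookie: 8-byte issue time | 24 random bytes | 32-byte HMAC."""
    issued = struct.pack("!Q", int(time.time() if now is None else now))
    nonce = os.urandom(24)
    mac = hmac.new(SERVER_SECRET, issued + nonce + address, hashlib.sha256).digest()
    return issued + nonce + mac


def check_cookie(cookie, address, max_age=60, now=None):
    """Accept a cookie-echo only if this responder issued it recently for this address."""
    if len(cookie) != 64:
        return False
    issued, nonce, mac = cookie[:8], cookie[8:32], cookie[32:]
    expected = hmac.new(SERVER_SECRET, issued + nonce + address, hashlib.sha256).digest()
    age = int(time.time() if now is None else now) - struct.unpack("!Q", issued)[0]
    return hmac.compare_digest(mac, expected) and age in range(max_age + 1)
```
&lt;/pre&gt;Here address is the sender's address as bytes, for example b"192.0.2.1:1935". Because the address is mixed into the HMAC, a cookie echoed from a different source address fails the check.&lt;br /&gt;&lt;br /&gt;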
The length of the cookie is 64 bytes, but it is stored as variable length data.&lt;br /&gt;&lt;br /&gt;The responder certificate is also a variable length data containing some opaque data that is understood by the higher level crypto system of the application. In this application, it uses the Diffie-Hellman (DH) secure key exchange as the crypto system.&lt;br /&gt;&lt;br /&gt;Note that multiple EPDs might map to a single endpoint, and the endpoint has a single certificate. A server that does not care about the man-in-middle attack or does not create secure EPD can generate a random certificate to be returned as the responder certificate.&lt;br /&gt;&lt;pre&gt;certificate = \x01\x0A\x41\x0E | dh-public-num | \x02\x15\x02\x02\x15\x05\x02\x15\x0E&lt;/pre&gt;&lt;br /&gt;Here the dh-public-num is a 64-byte random number used for DH secure key exchange.&lt;br /&gt;&lt;br /&gt;The initiator does not open another session to the same target identified by the responder certificate. If it detects that it already has an open session with the target it moves the new flow requests to the existing open session and stops opening the new session. The responder has not stored any state so does not need to care. (In our implementation we do store the initial state for simplicity, which may change later). This is one of the reasons why the API is flow-based rather than session-based, and the session is implicitly handled at the lower layer.&lt;br /&gt;&lt;br /&gt;If the initiator wants to continue opening the session, it sends the initiator-initial-keying message. 
The payload is as follows:&lt;br /&gt;&lt;pre&gt;initiator-initial-keying payload = initiator-session-id | cookie-echo&lt;br /&gt;| initiator-certificate | initiator-component | 'X'&lt;/pre&gt;&lt;br /&gt;Note that the payload is terminated by &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\x58&lt;/span&gt; (or 'X' character).&lt;br /&gt;&lt;br /&gt;The initiator picks a new session-id (32-bit number) to identify this new session, and uses it to demultiplex subsequent received packet. The responder uses this initiator-session-id as the session-id to format the scrambled session-id in the packet sent in this session.&lt;br /&gt;&lt;br /&gt;The cookie-echo is the same variable length data that was received in the responder-hello message. This allows the responder to relate this message with the previous responder-hello message. The responder will process this message only if it thinks that the cookie-echo is valid. If the responder thinks that the cookie-echo is valid except that the source address has changed since the cookie was generated it sends a cookie change message to the initiator.&lt;br /&gt;&lt;br /&gt;In this DH crypto system, &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;p&lt;/span&gt; and &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;g&lt;/span&gt; are publicly known. In particular, &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;g&lt;/span&gt; is 2, and &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;p&lt;/span&gt; is a 1024-bit number. 
The initiator picks a new random 1024-bit DH private number (&lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;x1&lt;/span&gt;) and generates the 1024-bit DH public number (&lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;y1&lt;/span&gt;) as follows, where ^ denotes exponentiation.&lt;br /&gt;&lt;pre&gt;y1 = g ^ x1 % p&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The initiator-certificate is understood by the crypto system and contains the initiator's DH public number (&lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;y1&lt;/span&gt;) in the last 128 bytes.&lt;br /&gt;&lt;br /&gt;The initiator-component is understood by the crypto system and contains an initiator-nonce to be used in the DH algorithm as described later.&lt;br /&gt;&lt;br /&gt;When the target receives this message, it generates a new random 1024-bit DH private number (&lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;x2&lt;/span&gt;) and generates the 1024-bit DH public number (&lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;y2&lt;/span&gt;) as follows.&lt;br /&gt;&lt;pre&gt;y2 = g ^ x2 % p&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now that the target knows the initiator's DH public number (&lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;y1&lt;/span&gt;), it generates the 1024-bit DH shared secret as follows.&lt;br /&gt;&lt;pre&gt;shared-secret = y1 ^ x2 % p&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The target generates a responder-nonce to be sent back to the initiator. The responder-nonce is as follows.&lt;br /&gt;&lt;pre&gt;responder-nonce = \x03\x1A\x00\x00\x02\x1E\x00\x81\x02\x0D\x02 | responder's DH public number&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The peer-id is the 256-bit SHA256 (hash) of the certificate. 
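&lt;br /&gt;&lt;br /&gt;The DH arithmetic above maps directly onto Python's built-in three-argument pow, as the following sketch shows. The real 1024-bit p is not given in this article, so the sketch uses a small toy prime instead. The function names are made up, and the certificate passed to peer_id is whatever byte string the crypto system uses.&lt;br /&gt;&lt;pre&gt;
```python
import hashlib
import secrets

G = 2
P = 2 ** 127 - 1   # toy prime for illustration only; the real p is a 1024-bit number


def dh_keypair():
    """Pick a random private number x and compute the public number y = g^x mod p."""
    private = secrets.randbelow(P - 3) + 2      # somewhere in 2 .. P-2
    public = pow(G, private, P)                 # modular exponentiation, not XOR
    return private, public


def dh_shared_secret(their_public, my_private):
    """Both sides arrive at the same value, g^(x1*x2) mod p."""
    return pow(their_public, my_private, P)


def peer_id(certificate):
    """The peer-id is the SHA-256 hash (32 bytes) of the certificate bytes."""
    return hashlib.sha256(certificate).digest()


x1, y1 = dh_keypair()                           # initiator
x2, y2 = dh_keypair()                           # target
target_secret = dh_shared_secret(y1, x2)        # shared-secret = y1 ^ x2 % p
initiator_secret = dh_shared_secret(y2, x1)     # the initiator's side of the same exchange
```
&lt;/pre&gt;Both secrets come out equal, which is what lets the two ends derive matching session keys without ever sending x1 or x2.&lt;br /&gt;&lt;br /&gt;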
At this time the responder knows the peer-id of the initiator from the initiator-certificate.&lt;br /&gt;&lt;br /&gt;The target picks a new 32-bit responder's session-id number to demultiplex subsequent packet for this session. At this time the server creates a new session context to identify the new session. It also generates asymmetric AES keys to be used for this session using the shared-secret and the initiator and responder nonces as follows.&lt;br /&gt;&lt;pre&gt;decode key = HMAC-SHA256(shared-secret, HMAC-SHA256(responder nonce, initiator nonce))[:16]&lt;br /&gt;encode key = HMAC-SHA256(shared-secret, HMAC-SHA256(initiator nonce, responder nonce))[:16]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The decode key is used by the target to AES decode incoming packet containing this responder's session-id. The encode key is used by the target to AES encode outgoing packet to the initiator's session-id. Only the first 16 bytes (128-bits) are used as the actual AES encode and decode keys.&lt;br /&gt;&lt;br /&gt;The target sends the responder-initial-keying message back to the initiator. The payload is as follows.&lt;br /&gt;&lt;pre&gt;responder-initial-keying payload = responder session-id | responder's nonce | 'X'&lt;/pre&gt;&lt;br /&gt;Note that the payload is terminated by &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\x58&lt;/span&gt; (or 'X' character). Note also that this handshake response is encrypted using the symmetric (handshake) AES key instead of the newly generated asymmetric keys.&lt;br /&gt;&lt;br /&gt;When the initiator receives this message it also calculates the AES keys for this session.&lt;br /&gt;&lt;pre&gt;encode key = HMAC-SHA256(shared-secret, HMAC-SHA256(responder nonce, initiator nonce))[:16]&lt;br /&gt;decode key = HMAC-SHA256(shared-secret, HMAC-SHA256(initiator nonce, responder nonce))[:16]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;As before, only the first 16 bytes (128-bits) are used as the AES keys. 
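&lt;br /&gt;&lt;br /&gt;The two key formulas can be tried out with Python's standard hmac module. The sketch below assumes that the first argument of HMAC-SHA256 in the formulas is the HMAC key and the second is the message, and that the shared secret and the nonces are already available as byte strings (for example the 1024-bit shared secret converted to 128 big-endian bytes). The function names are made up.&lt;br /&gt;&lt;pre&gt;
```python
import hashlib
import hmac


def hmac_sha256(key, message):
    """HMAC-SHA256 with the first argument as key (assumed argument order)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def responder_keys(shared_secret, initiator_nonce, responder_nonce):
    """Return (encode_key, decode_key) as seen by the target (responder)."""
    decode_key = hmac_sha256(shared_secret, hmac_sha256(responder_nonce, initiator_nonce))[:16]
    encode_key = hmac_sha256(shared_secret, hmac_sha256(initiator_nonce, responder_nonce))[:16]
    return encode_key, decode_key


def initiator_keys(shared_secret, initiator_nonce, responder_nonce):
    """Return (encode_key, decode_key) as seen by the initiator."""
    encode_key = hmac_sha256(shared_secret, hmac_sha256(responder_nonce, initiator_nonce))[:16]
    decode_key = hmac_sha256(shared_secret, hmac_sha256(initiator_nonce, responder_nonce))[:16]
    return encode_key, decode_key
```
&lt;/pre&gt;Each function returns two 16-byte keys, one for each direction of the session.&lt;br /&gt;&lt;br /&gt;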
The encode key of the initiator is the same as the decode key of the responder and the decode key of the initiator is the same as the encode key of the responder.&lt;br /&gt;&lt;br /&gt;When a server acts as a forwarder, it receives an incoming initiator-hello and sends a forwarded-initiator-hello in an existing session to the target. The payload is as follows.&lt;br /&gt;&lt;pre&gt;forwarded initiator hello payload := first | epd | transport-address | tag&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The first 8-bit value is &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\x22&lt;/span&gt;. The epd value is the same as that in the initiator-hello -- a variable length data containing epd-type and epd-value. The epd-type is &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;\x0f&lt;/span&gt; for a peer-to-peer session. The epd-value is the target peer-id that was received as epd-value in the initiator-hello.&lt;br /&gt;&lt;br /&gt;The tag is echoed from the incoming initiator-hello and is a fixed 16-byte value.&lt;br /&gt;&lt;br /&gt;The transport address contains a flag for indicating whether the address is private or public, the binary bits of the IP address and an optional port number. The transport address is that of the initiator as known to the forwarder.&lt;br /&gt;&lt;pre&gt;transport-address := flag | ip-address | port-number&lt;/pre&gt;&lt;br /&gt;The flag is an 8-bit number with the most significant bit set to &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;1&lt;/span&gt; if the port-number is present, otherwise &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;0&lt;/span&gt;. 
The least significant two bits are &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;10b&lt;/span&gt; for public IP address and &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;01b&lt;/span&gt; for private IP address.&lt;br /&gt;&lt;br /&gt;The ip-address is either the 4-byte (IPv4) or the 16-byte (IPv6) binary representation of the IP address.&lt;br /&gt;&lt;br /&gt;The optional port-number is a 16-bit number and is present when the flag indicates so.&lt;br /&gt;&lt;br /&gt;The server then sends a forwarded-hello-response message back to the initiator with the transport-address of the target.&lt;br /&gt;&lt;pre&gt;forwarded-hello-response = transport-address | transport-address | ...&lt;/pre&gt;&lt;br /&gt;The payload is basically one or more transport addresses of the intended target, with the public address first.&lt;br /&gt;&lt;br /&gt;After this the initiator client directly sends subsequent messages to the responder, and vice-versa.&lt;br /&gt;&lt;br /&gt;A normal-user-data message type is used to deal with any user data in the flows. The payload is shown below.&lt;br /&gt;&lt;pre&gt;normal-user-data payload := flags | flow-id | seq | forward-seq-offset | options | data&lt;/pre&gt;&lt;br /&gt;The flags, an 8-bit number, indicate fragmentation, options-present, abandon and/or final. The following table indicates the meaning of the bits from most significant to least significant.&lt;br /&gt;&lt;pre&gt;bit   meaning&lt;br /&gt;0x80  options are present if set, otherwise absent&lt;br /&gt;0x40&lt;br /&gt;0x20  with beforepart&lt;br /&gt;0x10  with afterpart&lt;br /&gt;0x08&lt;br /&gt;0x04&lt;br /&gt;0x02  abandon&lt;br /&gt;0x01  final&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The flow-id, seq and forward-seq-offset are all variable length numbers. The flow-id is the flow identifier. The seq is the sequence number. 
The forward-seq-offset is used for partially reliable in-order delivery.&lt;br /&gt;&lt;br /&gt;The options are present only when the flags indicate so using the most significant bit as &lt;span class="Apple-style-span" style="font-family:'courier new';"&gt;1&lt;/span&gt;. The options are as follows.&lt;br /&gt;&lt;br /&gt;TODO: define options&lt;br /&gt;&lt;br /&gt;The subsequent data in the fragment may be sent using next-user-data message with the payload as follows:&lt;br /&gt;&lt;pre&gt;next-user-data := flags | data&lt;/pre&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;This is just a compact form of the user data when multiple user data messages are sent in the same packet. The flow-id, seq and forward-seq-offset are implicit, i.e., flow-id is same and subsequent next-user-data have incrementing seq and forward-seq-offset. Options are not present. A single packet never contains data from more than one flow to avoid head-of-line blocking and to enable priority inversion in case of problems.&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;TODO&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Will update this article in future:&lt;/div&gt;&lt;div&gt;- Fill in the description of the remaining message flows beyond handshake.&lt;/div&gt;&lt;div&gt;- Describe the man-in-middle mode that enables audio/video flowing through the server.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-72211057416792908?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/72211057416792908/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=72211057416792908' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/13033340/posts/default/72211057416792908'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/72211057416792908'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2011/12/understanding-rtmfp-handshake.html' title='Understanding RTMFP Handshake'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-7668076874447656489</id><published>2011-12-07T00:24:00.002-05:00</published><updated>2011-12-12T18:19:51.185-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='video'/><category scheme='http://www.blogger.com/atom/ns#' term='Flash Player'/><category scheme='http://www.blogger.com/atom/ns#' term='Problems'/><category scheme='http://www.blogger.com/atom/ns#' term='H.264'/><title type='text'>Three Problems in Interoperating with H.264 of Flash Player</title><content type='html'>H.264 decoding has been part of Flash Player since version 9, but H264 encoding was recently added in version 11. Once Flash Player 11 beta was out I started looking in to integrating video translation in the &lt;a href="http://code.google.com/p/rtmplite"&gt;SIP-RTMP gateway&lt;/a&gt; project. For a Flash-to-Flash video conference you do not need to understand the problems related to H.264 in Flash Player because everything is taken care of behind the scenes by Flash Player. Adding H.264 support in the &lt;a href="http://code.google.com/p/flash-videoio"&gt;flash-videoio&lt;/a&gt; project was relatively straight forward. 
However, if you are building your own translator to interoperate video between Flash Player and some other application, you will need to understand these problems.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;1) The first problem is that Flash Player doesn't enable H.264 even for decoding if the RTMP connection does not use the new-style "secure" handshake. In the older version handshaking with bytes containing zeros worked, but not when using H.264. Eventually I found out about this by reading an open-source-flash (osflash) forum post and incorporated it in my gateway.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;2) The H.264 encoder generates some sequence headers (called SPS and PPS) which are essential in decoding the rest of the video data packets. The same is true of the AAC audio codec. In particular in live H.264 publish mode, Flash Player generates periodic SPS/PPS packets so the other Flash Player (or SIP phone) can join the call later and still be able to start decoding the stream. However, some existing SIP video phones generate the sequence packets only once at the beginning. The SIP-RTMP gateway needed to be modified to cache the sequence packets received from the non-Flash Player client and re-send them with the correct timestamp to the Flash Player client that joined the stream late.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;3) Looks like Flash Player 11.0 changed something related to buffering of live stream, which causes problems if the SIP side generates multiple slice NALUs (primitive data units in H.264) per frame. The Flash Player itself generates one NALU per frame, however some existing SIP video phones (e.g., Bria 3) generate old-style multiple slices per frame with one NALU per slice, which cannot be decoded and displayed in Flash Player 11 in live mode. You can read more about &lt;a href="https://bugbase.adobe.com/index.cfm?event=bug&amp;amp;id=2991202"&gt;the problem&lt;/a&gt;. This is not a problem for buffered playback though. 
&lt;b&gt;(update on 12/12/2011 -- I can verify that this bug has been fixed in Flash Player 11.2.202.96 and video call works fine now between Bria 3 and Flash Player via my SIP-RTMP gateway)&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Ekiga SIP phone uses the new-style RTP mechanism for fragmenting a full H.264 frame instead of using multiple slices in H.264 encoding. This can be easily translated to Flash Player and works with my SIP-RTMP gateway. However, Ekiga has another problem in incorrectly interpreting RTP timestamp of received stream which makes it play the stream much slower.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-7668076874447656489?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/7668076874447656489/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=7668076874447656489' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7668076874447656489'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7668076874447656489'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2011/12/three-problems-in-interoperating-with.html' title='Three Problems in Interoperating with H.264 of Flash Player'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' 
src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-392646236530225838</id><published>2011-12-05T23:06:00.003-05:00</published><updated>2011-12-06T20:04:02.272-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><title type='text'>The Philosophy of Open Source</title><content type='html'>I recently read a book by Henrik Ingo on "&lt;a href="http://openlife.cc/online"&gt;Open Life: The Philosophy of Open Source&lt;/a&gt;". I strongly recommend software developers as well as technical managers to read it. Here I present some excerpts that I find very interesting in the book.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"If a buyer is willing to pay a lot for it, then a cheap product can be sold at a high price... It is not stupid to ask a high price, but it is to pay it."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"The law of supply and demand can lead to situations that seem strange when common sense is applied to them. ... The oil is no different to the oil that was on sale at a considerably cheaper price just the day before...  When supply goes down, the price goes up - even if all else remains equal."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Often, the kind of stuff branded a trade secret can also be absurdly insignificant, but the important thing is that they don't tell others about it. Today's companies are at least as interested in the things they don't do as the things they pretend to be doing and producing... There's an ominous sense that much of what we do is done with a logic of mean-spiritedness, whether it is in business or in our everyday lives!"&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"In a word, Europe's farming policy is based on mean-spiritedness. 
The subsidies policy is based on farmers agreeing not to produce more food than their agreed quota (to keep the supply low and prices high)."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"The logic of mean-spiritedness that follows from the law of supply and demand, can also be found in all fields of commerce where there is any so called 'immaterial property', including IT, music, film, and other kinds of entertainment, but the most glaring examples of it occur within the world of computers."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"These three demands -- features, quality, and deadline -- would build certain tension into any project. If, for instance, the schedule is too tight, there may not be enough time to include all the features you want. But if a project manager leans on his team, demanding that all the features are included and the deadline be met, then they are compelled to do a rushed job and, inevitably, quality suffers. ... The Open Source community's no-deadlines principle makes excellent sense, and is probably one of the reasons Open Source programs are so successful... One of the most frequently asked questions at the time was, 'When will the next version of Linux be released?' Linus had a stock answer, which was always, 'When it's done.'"&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Why do people do things? The first reason is survival. The second reason is to have social life. And the third reason is that people do things for fun. In that order... 
Since we work to have fun, to enjoy it, then why do we drive ourselves into the ground trying to meet artificial deadlines?"&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Usually, the vision and business strategies which guide a company are created in the upper echelons of management, after which it's up to the employees to do whatever the boss requires of them...But the principle of 'do whatever you like' would suggest that the management should quit producing the whole vision and business strategies, and focus instead on making it possible for employees to realize their own vision as best they can. (Unfortunately) For many managers such a concept would seem totally alien."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"The lazier the programmer, the more code he writes. .. Typing is too arduous for him, so he writes the code for a word processing program... Because it's too much effort to print out a letter and take it to the postbox, he writes the code for e-mail... So, laziness is a programmer's prime virtue."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"It's not healthy for one's central motivation to be hatred and fear. And what if one day Linux did manage to bring down Microsoft? Would life then lose its meaning? In order to energize themselves, would the programmers then have to find some new and fearful threat to compete against?"&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"Since the beginning of hacking, Open Source hackers have always made programs to suit their own needs. ... As a client, the Federal Republic of Germany accepted this logic, and they aren't likely to have any reason to complain. Not only did they get what they wanted, they got a high-quality solution, they got it cheap, and they got it fast. 
What could be unfair about that?"&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"An interesting situation -- IBM had to keep developing Eclipse; yet, financially, investing in it was a bad idea. The solution, of course, was Open Source."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"A company that has calculated its tender openly is much easier to trust. If I were to receive an honest tender of 1,000,000 from a company that operated with open principles, and the tender from a closed company came in at 999,500, I am likely to laugh at the latter and accept the former."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"I once read somewhere about a study which showed that about 20 percent of ants in an anthill do totally stupid things, such as tear down walls the other ants have just built, or take recently gathered food and stow it where none of them will ever find it, or do other things to sabotage everything the other ants so diligently try to achieve. The theory is that these ants don't do these things out of malice but simply because they're stupid... Critics of Open Source projects claim that their non-existent hierarchy and lack of organization leads to inefficiency... If a number of people do some stupid things, we make a rule to say it mustn't be done. Then we need a rule that says everybody has to read the rules. Before long, we need managers and inspectors to make sure people read and follow the rules and that nobody does anything stupid, even by mistake. Finally, the organization has a large group of people who spend time thinking up and writing rules, and enforcing them. And those not involved in doing, are primarily concerned with not breaking the rules...However, Linux and Wikipedia prove the opposite is true... This is particularly true when you factor in that not all planners (managers) are all that smart. 
Which means organizations risk having their entire staff doing something really inane, because that's what somebody planned. So, it seems better to have a little overlapping and lack of planning, because at least you have better odds for some of the overlapping activities actually making sense..."&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-392646236530225838?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/392646236530225838/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=392646236530225838' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/392646236530225838'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/392646236530225838'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2011/12/philosophy-of-open-source.html' title='The Philosophy of Open Source'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-3608412093730958889</id><published>2011-12-03T14:03:00.008-05:00</published><updated>2011-12-03T14:34:22.065-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='video'/><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='webrtc'/><category scheme='http://www.blogger.com/atom/ns#' term='Flash 
Player'/><category scheme='http://www.blogger.com/atom/ns#' term='Evolution'/><category scheme='http://www.blogger.com/atom/ns#' term='Communication'/><category scheme='http://www.blogger.com/atom/ns#' term='Protocols'/><title type='text'>Internet Video Communication: Past, Present and Future</title><content type='html'>&lt;a href="http://2.bp.blogspot.com/-85rQI27Sh30/Ttp5pFQZ5SI/AAAAAAAAACA/WCpEMTJNHqE/s1600/hello123-2011.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 200px; height: 140px;" src="http://2.bp.blogspot.com/-85rQI27Sh30/Ttp5pFQZ5SI/AAAAAAAAACA/WCpEMTJNHqE/s200/hello123-2011.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5681987626573882658" /&gt;&lt;/a&gt;&lt;br /&gt;I gave a presentation last month titled &lt;a href="http://www.slideshare.net/kundan10/hello-1-2-3-can-you-see-me-now"&gt;Hello 1 2 3, can you &lt;strike&gt;hear&lt;/strike&gt; see me now?&lt;/a&gt; highlighting my point of view on the origins of Internet video communication technologies we see today. The full text of the presentation can be found at &lt;a href="http://kundansingh.com/talks/hello123-2011.pdf"&gt;Internet video communication: past, present and future&lt;/a&gt;.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Modern video communication systems have roots in several technologies: transporting video over phone lines, using multicast on Internet2's Mbone, adding video to voice-over-IP (VoIP), and adding interactivity in existing streaming applications. Although the Internet telephony and multimedia communication protocols have matured over the last fifteen years, they are largely being used for interconnectivity among closed networks of telecom services. Recently, the world wide web has evolved as a popular platform for everything we do on the Internet including email, text chat, voice calls, discussions, enterprise applications and multi-party collaboration. 
Unfortunately, there is a disconnect between the web and traditional Internet telephony protocols as they have ignored the constraints and requirements of each other. Consequently, Adobe's Flash Player is being used as a web browser plugin by many developers for voice and video calls over the web.   &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Learning from the mistakes of the past and knowing where we stand at present will help us build the Internet video communication systems of the future. I present my point of view on the evolution, challenges and mistakes of the past, and, moving forward, describe the challenges in bridging the gap between web and VoIP. I highlight my contributions at various stages in the journey of Internet audio/video communication protocols.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-3608412093730958889?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/3608412093730958889/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=3608412093730958889' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3608412093730958889'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3608412093730958889'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2011/12/internet-video-communication-past.html' title='Internet Video Communication: Past, Present and Future'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' 
src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-85rQI27Sh30/Ttp5pFQZ5SI/AAAAAAAAACA/WCpEMTJNHqE/s72-c/hello123-2011.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-2609443052250230548</id><published>2011-08-20T15:57:00.006-04:00</published><updated>2011-08-20T16:32:49.390-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='video'/><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='Flash Player'/><category scheme='http://www.blogger.com/atom/ns#' term='Open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='Document'/><category scheme='http://www.blogger.com/atom/ns#' term='siprtmp'/><title type='text'>Flash-based audio and video communications in the cloud</title><content type='html'>&lt;a href="http://www.theintencity.com/siprtmp-screen1.jpg" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 255px; height: 166px;" src="http://www.theintencity.com/siprtmp-screen1.jpg" border="0" alt="" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;Internet telephony and multimedia communication protocols have matured over the last fifteen years. Recently, the web is evolving as a popular platform for everything we do on the Internet including email, text chat, voice calls, discussions, enterprise apps and multi-party collaboration. Unfortunately, there is a disconnect between web and traditional Internet telephony protocols as they have ignored the constraints and requirements of each other. 
Consequently, the Flash Player is being used as a web browser plugin by many developers for web-based voice and video calls. We describe the challenges of video communication using a web browser, present a simple API using a Flash Player application, show how it supports a wide range of web communication scenarios in the cloud, and describe how it can interoperate with Session Initiation Protocol (SIP)-based systems. We describe both the advantages and challenges of Flash Player based communication applications. The presented API could guide future work on communication-related web protocol extensions.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;More details are available in our &lt;a href="http://arxiv.org/pdf/1107.0011v1"&gt;white-paper&lt;/a&gt;. The associated software and example use cases are available as &lt;a href="http://code.google.com/p/flash-videoio"&gt;flash-videoio&lt;/a&gt; and &lt;a href="http://code.google.com/p/siprtmp/"&gt;siprtmp&lt;/a&gt; projects. The white-paper also serves as the architecture and design document of these projects.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-2609443052250230548?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/2609443052250230548/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=2609443052250230548' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2609443052250230548'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2609443052250230548'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2011/08/flash-based-audio-and-video.html' title='Flash-based audio and video 
communications in the cloud'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-2567337008630862987</id><published>2011-08-20T15:41:00.004-04:00</published><updated>2011-08-20T16:30:55.301-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='webrtc'/><category scheme='http://www.blogger.com/atom/ns#' term='rtc-web'/><category scheme='http://www.blogger.com/atom/ns#' term='API'/><category scheme='http://www.blogger.com/atom/ns#' term='Open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='vvowproject'/><title type='text'>Voice and Video Communications on Web</title><content type='html'>&lt;a href="http://www.theintencity.com/webconf-screen1.jpg" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 255px; height: 172px;" src="http://www.theintencity.com/webconf-screen1.jpg" border="0" alt="" /&gt;&lt;/a&gt;&lt;br /&gt;I co-authored and presented a paper on "SIP APIs for voice and video communications on the web" at IPTcomm 2011. The paper compares various alternative architectures, and presents the components of our ongoing project at IIT, Chicago. We are open to sponsorship of the project to further continue its R&amp;amp;D work. Please feel free to get in touch with me or Prof. Davids if you are interested in sponsoring student projects in her lab related to this technology.
&lt;br /&gt;&lt;br /&gt;&lt;div&gt;The &lt;a href="http://arxiv.org/pdf/1106.6333v1"&gt;paper&lt;/a&gt; and the presentation &lt;a href="http://www.slideshare.net/kundan10/voice-and-video-communications-on-the-web"&gt;slides&lt;/a&gt; are available. The &lt;a href="https://sites.google.com/site/vvowproject/"&gt;project page&lt;/a&gt;, open &lt;a href="http://code.google.com/p/vvowproject/"&gt;source code&lt;/a&gt;, and free &lt;a href="http://gardo1.rice.iit.edu/webconf/"&gt;demonstration page&lt;/a&gt; are also available.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Abstract: Existing standard protocols for the web and Internet telephony fail to deliver real-time interactive communication from within a web browser. In particular, the client-server web protocol over reliable TCP is not always suitable for end-to-end low latency media path needed for interactive voice and video communication. To solve this, we compare the available platform options using the existing technologies such as modifying the web programming language and protocol, using an existing web browser plugin, and a separate host resident application that the web browser can talk to. We argue that using a separate application as an adaptor is a promising short term as well as long-term strategy for voice and video communications on the web. Our project aims at developing the open technology and sample implementations for web-based real-time voice and video communication applications. We describe the architecture of our project including (1) a RESTful web communication API over HTTP inspired by SIP message flows, (2) a web-friendly set of metadata for session description, and (3) an UDP-based end-to-end media path. All other telephony functions reside in the web application itself and/or in web feature servers. The adaptor approach allows us to easily add new voice and video codecs and NAT traversal technologies such as Host Identity Protocol. 
We want to make web-based communication accessible to millions of web developers, maximize the end user experience and security, and preserve the huge global investment in and experience from SIP systems while adhering to web standards and development tools as much as possible. We have created an open source prototype that allows you to freely use the conference application by directing a browser to the conference URL.&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-2567337008630862987?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/2567337008630862987/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=2567337008630862987' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2567337008630862987'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2567337008630862987'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2011/08/voice-and-video-communications-on-web.html' title='Voice and Video Communications on Web'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-7202820513075212206</id><published>2011-06-16T08:55:00.001-04:00</published><updated>2011-06-16T13:05:36.490-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Specification'/><category scheme='http://www.blogger.com/atom/ns#' term='Open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><title type='text'>A Proposal for Reference Implementation Repository of SIP-related RFCs</title><content type='html'>One of the root causes of non-interoperable implementations is the misinterpretation of the specification. A number of people have claimed that SIP has become complicated and has failed to deliver its promise of mix-and-match interoperability. There are two main reasons: (a) the number of SIP related RFCs and drafts is growing faster than what a developer or product life-cycle can catch up with, and (b) many of the RFCs and drafts are not supported by an open implementation which results in misinterpretation of some aspects of the specification by the programmers.  The job of a SIP programmer is to (1) read the RFC and draft for SIP or its extensions, (2) understand the logic and figure out how it fits in the big picture or how it relates to the existing SIP source code, (3) come up with some object-oriented class diagram, classes' properties and pseudo-code for their methods, and finally (4) implement the classes and methods.&lt;br /&gt;&lt;br /&gt;Clearly the text in RFCs and drafts cannot be as unambiguous as real source code of a computer program. So many programmers may read and implement some features differently, resulting in non-interoperable implementations. Having a readily available pseudo-code for SIP and many of its extensions relieves the programmer of error-prone step (2) above, and resolves any misinterpretation at an early stage. There is a huge cost paid by the vendor or provider for this programmer's misinterpretation of the specification.&lt;br /&gt;&lt;br /&gt;This &lt;b&gt;project proposal&lt;/b&gt; is to keep an open and public repository of reference implementation of RFC 3261 and other SIP-related extensions. 
If this repository is maintained by public bodies such as SIPForum and the open source community, it will give developers easy access and enable better interoperability of new extensions.&lt;br /&gt;&lt;br /&gt;The goal of this effort will be to encourage submission of reference implementations by RFC and Internet Draft authors. In case of any ambiguity, the clarification will be applied not only to the specification but also to the reference implementation.&lt;br /&gt;&lt;br /&gt;If we use a very high level language such as Python, then the reference implementation essentially also serves as pseudo-code, which can be ported to other programming languages. The goal is not to get involved in the syntax of a particular programming language, but just to express the ideas more formally to prevent misinterpretation of the specification. If Python is not suitable, then a similar high level language syntax can be defined.&lt;br /&gt;&lt;br /&gt;This will greatly simplify the job of a programmer and, in the long term, will result in more interoperable and robust products that seamlessly support new SIP extensions and features. Programmers will have fewer things to worry about, and hence can write more accurate code in less time. From a specification author's point of view, it will encourage him/her to write a more solid and implementable specification without ambiguity, and to provide the pseudo-code in the draft. From a reviewer's point of view, one can easily judge the complexity of various alternatives or features, e.g., one can say that adding the extension 'foo' is just 10 lines of pseudo-code on top of the base SIP reference implementation.&lt;br /&gt;&lt;br /&gt;It will help RFC and draft authors see the complexity and implementation aspects of their proposal. Sometimes an internet-draft proposes multiple solutions without any details on them. 
This is partially due to the lack of implementation and complexity evaluation of the various approaches. With a reference implementation and pseudo-code repository, the author can provide a patch to the existing code to evaluate the complexity of the proposal.&lt;br /&gt;&lt;br /&gt;A few years ago I wrote a tool to annotate software source code with the corresponding RFC/draft, so that when you are reading a class or method in a source code file, you can quickly see which part of the RFC/draft it implements. Please see &lt;a href="http://39peers.net/download/python/doc/html/rfc2617.py.html"&gt;an example here&lt;/a&gt; and &lt;a href="http://39peers.net/download/python/doc/html/rfc3550.py.html"&gt;here&lt;/a&gt;. Such annotations in a reference implementation will help in correlating the RFC/draft with the actual implementation.&lt;br /&gt;&lt;br /&gt;If there is wide support for this proposal, we can raise it with SIPForum or other bodies, and help bootstrap the repository with reference implementations of a few SIP-related RFCs. Then we can invite contributions from the community and RFC/draft authors towards completing the implementations. 
Please post your comment to let us know what you think.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-7202820513075212206?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/7202820513075212206/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=7202820513075212206' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7202820513075212206'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7202820513075212206'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2011/06/proposal-for-reference-implementation.html' title='A Proposal for Reference Implementation Repository of SIP-related RFCs'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-1771396430081474668</id><published>2011-06-12T23:37:00.005-04:00</published><updated>2011-08-20T15:41:23.224-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Conferencing'/><category scheme='http://www.blogger.com/atom/ns#' term='RESTful'/><category scheme='http://www.blogger.com/atom/ns#' term='webrtc'/><category scheme='http://www.blogger.com/atom/ns#' term='REST'/><category scheme='http://www.blogger.com/atom/ns#' term='vvowproject'/><title type='text'>RESTful communication over WebSocket</title><content type='html'>This article shows how to implement 
generic resource oriented communication on top of synchronous channel such as WebSocket. This is a continuation of my previous article on REST and SIP [&lt;a href="http://p2p-sip.blogspot.com/2009/11/rest-and-sip.html"&gt;1&lt;/a&gt;] and provides more concrete thoughts because I now have an actual implementation of part of this in my web conferencing application. Other people have commented on the idea of REST on WebSocket [&lt;a href="http://www.kimchy.org/rest_and_web_sockets/"&gt;2&lt;/a&gt;]. (Using the term RESTful, which inherently is stateless, is confusing with a stateful WebSocket transport. Changing the title of this article to "Resource oriented and event-based communication over WebSocket" would be more appropriate.)&lt;br /&gt;&lt;br /&gt;&lt;div&gt;Following are the basic principles of such a mechanism.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Enable resource-oriented (instead of RPC-style) communication. &lt;/li&gt;&lt;li&gt;Enable asynchronous notification when a resource (or its child resource) changes. &lt;/li&gt;&lt;li&gt;Enable asynchronous event subscribe, publish, and notification.&lt;/li&gt;&lt;li&gt;Enable Unix file system style access control on resources.&lt;/li&gt;&lt;li&gt;Enable the concept of transient vs persistent resources.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;Consider my favorite example of a web conferencing application. The list of logged-in users is represented as the resource /login relative to the hosting domain, and the list of calls as /call. If the provider supports the concept of registered subscribers, those can be at /user. For example, /user/kundan10@gmail.com can be my user profile. 
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now let us look at how these principles apply in this example.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;1) Enable resource-oriented (instead of RPC-style) communication.&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Every resource has its own representation, e.g., in JSON or XML. For example, /login/bob@home.com can be {"name": "Bob Smith", "email": "bob@home.com",  "has-video": true, "has-audio": true}. The client-server communication to access these resources can be over HTTP using standard RESTful requests, or over WebSocket.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Over WebSocket, the client sends a request of the form '{"method":"PUT","resource":"/login/bob@home.com", "type":"application/json","entity":{"name":"Bob Smith", ...}}' to login. The server treats this the same as a RESTful PUT request received over plain HTTP.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A resource-oriented (instead of RPC-style) communication allows us to keep all the business logic in the client, which uses the server only as a data store. The standard HTTP methods allow access to such data, e.g., POST to create, PUT to update, GET to read and DELETE to delete. POST is a special method that must return the resource identifier of the newly created resource. For example, when a client does POST /call to create a new call, the server returns {"id": "conf123"} to indicate that the new resource identifier is "conf123" relative to /call and can be accessed at "/call/conf123". 
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;2) Enable asynchronous notification when a resource (or its child resource) changes.&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Many web applications including web conferencing need the notion of asynchronous notifications, e.g., when a user is online, or a user joins/leaves a call. Traditionally, Internet communication has used protocols such as SIP and XMPP for asynchronous notifications. With the advent of WebSocket (and the popular socket.io project) it is possible to implement persistent client-server connection for asynchronous notifications and messages within the web browser. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In this mechanism, a generic notification architecture is applied to resources. A new method named "SUBSCRIBE" is used to subscribe to any resource. A subscriber receives notification whenever the resource or its immediate children are changed (created, updated or deleted). For example, a conference participant sends the following over WebSocket: '{"method":"SUBSCRIBE","resource":"/call/conf123"}'. Whenever the server detects that a new PUT is done for "/call/conf123/participant12" or a new POST is done for "/call/conf123" it sends a notification message to the subscriber over WebSocket: '{"notify":"UPDATE","resource":"/call/conf123","type":"application/json","entity":{ ... child resource}, "create":"participant12"}'. On the other hand, if the moderator does a PUT on "/call/conf123", then the server sends a notification as '{"notify":"PUT","resource":"/call/conf123","type":"application/json", "entity":{... parent resource}}'. 
In summary, the server generates the notification to both the modified resource "/call/conf123/participant12" as well as the parent resource, "/call/conf123".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The notification message contains a new "notify" attribute instead of re-using the "method" attribute to indicate the type of notification. For example, "PUT", "POST", "DELETE" means that the resource identified in "resource" attribute has been modified using that method by another client. In this case the "type" and "entity" attribute represent the "resource". Similarly, "UPDATE" means that a child resource has been modified and the details of the child resource identifier is in "create", "update" or "delete" attribute. In this case the "type" and "entity" attribute represent the child resource identified in "create", "update" or "delete". &lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The concept of notifications when a resource change is available in ActionScript programming language. For example, a markup text can use width="{size}" to bind the "width" property of a user interface component to the "size" variable. Whenever the "size" changes the "width" is updated. A property change event is dispatched to enable the notification. Similarly in our mechanism, a resource can be subscribed for to detect change in its value or the value of its children resources by the client application.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;3) Enable asynchronous event subscribe, publish, and notification&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The previous point enables a client to receive notification when a resource changes and these notifications are server generated notifications. Additionally, we need a generic end-to-end publish-subscribe mechanism to allow a client to send notification to all the subscribers without dealing with a resource modification. 
This allows end-to-end notifications from one client to others, via the server.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When a client subscribes to a resource, it also receives generic notifications sent by another client on that resource. A new NOTIFY method is defined. For example, if a client sends '{"method":"NOTIFY","resource":"/login/bob@home.com","type":"application/json","data":{"invite-to":"/call/conf123","invite-id":"6253"}}', and another client is subscribed to /login/bob@home.com, then it receives a notification message as '{"notify":"NOTIFY", "resource":"/login/bob@home.com","type":"application/json","data":{...}}'. In summary, the server just passes the "data" attribute to all the subscribers. The "notify" value of "NOTIFY" means an end-to-end notification generated because another client sent a NOTIFY method.&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;In a web conferencing application, most of the notifications are associated with a resource, e.g., conference membership change, presence status change, etc. Some notifications such as call invitation or cancel can be independent of a resource, and the NOTIFY mechanism can be used. For example, sending a NOTIFY to /login/bob@home.com is received by all the subscribers of this resource. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;4) Enable Unix file system style access control on resources.&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Without an authentication and access control mechanism, the resource oriented communication protocol described earlier becomes useless. Fortunately, it is possible to design a generic access control mechanism similar to Unix file system. Essentially, each resource is treated as a file and a directory. In analogy, all the child resources of this resource belong to the directory, whereas the resource entity belongs to the file. 
The service provider can configure top-level directories with user permissions, e.g., anyone can add a child to "/user", and once added it will be owned by that user. Thus if user Bob creates /user/bob, then Bob owns the subtree of this resource. It is up to Bob to configure the permissions of its child resources. For example, he can configure /user/bob/inbox to be writable by anyone but readable only by himself, e.g., permissions "rwx-w--w-". This allows a web based voice and video mail application.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Unlike the traditional file system data model with create, update, read and write, we also need a permission bit for subscription. For example, only Bob should be allowed to subscribe to /user/bob so that other users cannot get notifications sent to Bob. The concept of a group is a little vague but can be incorporated in this mechanism as well. Finally, a notion of an anonymous user needs to be added so that any client which does not have an account with the service provider can also get permissions if needed. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In summary, the permission bits become five bits for each of the four categories: self, group, others-authenticated, others-anonymous. The five bits define permissions to allow create, read, update, write and subscribe. Existing authentication such as HTTP basic/digest, cookies or OAuth based sessions can be used to actually authenticate the client.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;5) Enable the concept of transient vs persistent resources.&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In software programming, application developers typically use local and global variables to represent transient and persistent data respectively. A similar concept is needed in the generic resource oriented communication application. So far we have seen how to read, write, update and create resources. 
Each resource can be transient, so that it is deleted when the client which created the resource is disconnected, or persistent, so that it remains even after the client disconnects. For example, when a client POSTs to /call/conf123, it wants that resource to be transient, so that it gets deleted when the client is disconnected. This allows the resource to be used as a conference membership resource, and the server notifies other participants when an existing participant is disconnected. On the other hand, when a client POSTs to /user/bob@home.com, it wants it to be the persistent user profile which is available even when this user has disconnected.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The concept of transient and persistent in the resource-oriented system allows the client to easily create a variety of applications without having to write custom kludges. In general a new resource should be created as transient by default, unless the client requests a persistent resource. Whenever the client disconnects the WebSocket all the transient resources (or local variables) of that client are deleted, and appropriate notifications are sent to the subscribers as needed.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Implementation&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I have implemented part of this concept in my web conferencing project. The server side (called the service provider) is a generic PHP "restserver.php" application that accepts WebSocket connections and uses a back-end MySQL database to store and manage resources and subscriptions. Each connected client is assigned a unique client-id. 
There are two database tables: the resource table has fields resource-id (e.g., "/login/bob@home.com"), parent-resource-id (e.g., "/login"), type (e.g., "application/json"), entity (i.e., actual JSON representation string), and client-id, whereas the subscribe table has fields resource-id of the target resource and client-id of the subscriber. The subscriptions are always transient, i.e., when the client disconnects, all the subscribe rows for that client-id are removed. The resources can be transient or persistent. By default any new resource is created as transient and the client-id is stored in that row. When the client disconnects, all the resources with the matching client-id are removed and appropriate notifications are generated. A client can create a persistent resource by supplying the "persistent": true attribute in the PUT or POST request, and the server stores an empty client-id for that new resource. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The generic "restserver.php" server application can be used in a variety of web communication applications, and we have shown it to work with web conferencing and slide presentation applications.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-1771396430081474668?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/1771396430081474668/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=1771396430081474668' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/1771396430081474668'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/13033340/posts/default/1771396430081474668'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2011/06/restful-communication-over-websocket.html' title='RESTful communication over WebSocket'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-846844914710555711</id><published>2011-06-02T09:20:00.002-04:00</published><updated>2011-06-02T12:31:29.676-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='webrtc'/><category scheme='http://www.blogger.com/atom/ns#' term='rtc-web'/><category scheme='http://www.blogger.com/atom/ns#' term='Flash Player'/><category scheme='http://www.blogger.com/atom/ns#' term='API'/><title type='text'>WebRTC vs Flash Player</title><content type='html'>This year has been great for the world of IP communications so far -- with the Skype deal, Flash Player adding echo cancellation, and now Google open sourcing &lt;a href="https://sites.google.com/site/webrtc/"&gt;WebRTC&lt;/a&gt; (with &lt;a href="http://code.google.com/p/webrtc"&gt;source code&lt;/a&gt;) that includes the audio/video codecs and quality engines. &lt;br /&gt;&lt;br /&gt;RTC-Web is an effort started in the IETF (and Web-RTC in W3C) to standardize the way media streams are transported end-to-end between two browser instances for a real-time communication experience within the browser. It consists of a protocol for establishing an end-to-end media path, abstractions for audio/video codecs and devices, and the language elements to use this feature from within Javascript/HTML. 
Traditionally, browser communication has been done using plugins such as Flash Player. I have written a few open source software projects that use Flash based audio and video communication (&lt;a href="http://code.google.com/p/flash-videoio"&gt;flash-videoio&lt;/a&gt;, &lt;a href="http://code.google.com/p/siprtmp"&gt;siprtmp&lt;/a&gt;, &lt;a href="https://sites.google.com/site/vvowproject/"&gt;vvowproject&lt;/a&gt;). The WebRTC effort brings a completely new dimension, in a good way, because now we do not depend on external plugins for web based real time communications. Real-time communication becomes a first class construct for web developers.&lt;br /&gt;&lt;br /&gt;This article summarizes some differences between the WebRTC and Flash Player approaches for real-time audio/video communication. It also mentions a separate application approach as described in the VVoW project.&lt;br /&gt;&lt;br /&gt;WebRTC is in line with the evolution of web protocols whereas using Flash Player is like patching an incomplete system. With WebRTC there is no external dependency beyond the basic web browser. However, given the ubiquitous availability of Flash Player compared to basic inter-operating HTML5 features, the Flash Player approach is still promising, at least in the short term.&lt;br /&gt;&lt;br /&gt;The number of web developers who understand Javascript/HTML is clearly much larger than the number who understand Actionscript/MXML, which benefits the WebRTC approach as there can be many more new applications and use cases implemented in practice. However, the complexity of building a Javascript based application combining various individual pieces of the communication elements may be overwhelming. 
On the other hand, existing IDE tools for Flash development take away a lot of complexity from the developers.&lt;br /&gt;&lt;br /&gt;Many users are reluctant to change their browser, and hence getting ubiquitous user adoption may take a long time unless this gets added to Internet Explorer. 
Moreover, dealing with device interfaces in a portable manner is a challenge. It is also not clear how the devices should be accessed across multiple instances of the same browser or across different browsers. &lt;br /&gt;&lt;br /&gt;In the past, incompatibility in HTML among browsers has been a nightmare for web developers, and extending HTML for yet another feature is bound to cause more interoperability problems. Two interoperability scenarios are significant: between browsers from different vendors running the same web page, and between two different web sites. The latter is tricky from a security point of view if open standards are used, because a web site owner may want to restrict its users from communicating with users of another web site, whereas the protocol will be capable of such communication.&lt;br /&gt;&lt;br /&gt;On the other hand, Flash Player has shown more ubiquitous availability on users' desktops and laptops than any specific web browser. Flash Player allows implementing platform agnostic software because all the incompatibilities between browsers and platforms are taken care of by the plugin vendor. &lt;br /&gt;&lt;br /&gt;Flash Player has the ability to do group communication by building a scalable application level multicast tree among Flash Player instances. This is useful for one-to-many broadcast type communication scenarios. WebRTC is still in the initial phases of two party communication. Obviously, multiparty communication can be built on top of the two-party communication elements, but requires more effort to achieve efficiency.&lt;br /&gt;&lt;br /&gt;In terms of video codecs, WebRTC provides an open source high quality video codec, whereas Flash Player's camera captured video still uses the outdated Sorenson codec, which is difficult to interoperate with non-Flash products. 
Availability of source code enables a WebRTC-based project to add new codecs as needed without depending on the vendor to provide new audio and video codec features.&lt;br /&gt;&lt;br /&gt;The main problem with the Flash Player approach is that the protocol for the end-to-end media path is proprietary, so interoperating with existing VoIP gear is inefficient without buying server pieces from the plugin vendor. Although interoperability is possible using open RTMP and SIP-RTMP translators, it is not efficient because the browser to translator media path over TCP incurs unnecessary latency for some users. Secondly, for any new feature, we depend on the vendor, for example, echo cancellation, new codec, portability to new device. Luckily, Adobe has been releasing new updates with new features periodically. For example, the echo cancellation feature released in Flash Player 10.3 solved a lot of problems for real-time communication. (Please see the public-chat demo on my &lt;a href="http://code.google.com/p/flash-videoio"&gt;flash-videoio&lt;/a&gt; project page to try out the video conference with echo cancellation.)&lt;br /&gt;&lt;br /&gt;Some problems common to both the approaches are: (1) lack of a listening TCP socket or a general purpose UDP socket which could be used to implement a peer-to-peer application protocol within the browser without relying on servers, (2) the scope of an application is within a web page as defined by the Javascript or Flash elements, so if the user navigates to another web page the communication is lost. This is not a problem for the web communication use case, but people are generally not used to this model in traditional communication.&lt;br /&gt;&lt;br /&gt;On the other hand, the separate application model as used in the VVoW project allows you to have host resident software for communication, which can be used by any application including a web application running in your browser by connecting to the resident software locally. 
The resident application can reuse the existing research, e.g., Host Identity Protocol and P2P-SIP. This can save initial setup time for every connection of WebRTC. The main problem is that it involves yet another download and installation by the end user which hampers wide adoption.&lt;br /&gt;&lt;br /&gt;I will continue to explore the WebRTC software developed by Google and try to include it in my open source projects. Some example projects could be: (1) add interoperability between WebRTC and Flash Player for communication in my siprtmp project, (2) add option to detect WebRTC support and use that in my flash-videoio project if available, and fallback to Flash Player, and (3) use the WebRTC source code to implement a separate application with high quality end-to-end media path in the VVoW project, and (4) create a Python wrapper to use WebRTC from within any Python application.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-846844914710555711?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/846844914710555711/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=846844914710555711' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/846844914710555711'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/846844914710555711'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2011/06/webrtc-vs-flash-player.html' title='WebRTC vs Flash Player'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' 
src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-9220703664875574725</id><published>2011-04-15T18:58:00.005-04:00</published><updated>2011-04-15T19:58:30.510-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rtmplite'/><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='multitask'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='gevent'/><category scheme='http://www.blogger.com/atom/ns#' term='Python'/><category scheme='http://www.blogger.com/atom/ns#' term='siprtmp'/><title type='text'>Performance of siprtmp: multitask vs gevent</title><content type='html'>Poor performance has been an issue in my &lt;a href="http://code.google.com/p/rtmplite"&gt;RTMP&lt;/a&gt; server and &lt;a href="http://code.google.com/p/siprtmp"&gt;SIP-RTMP gateway&lt;/a&gt;. Traditionally, I blamed the multitask framework for the poor performance. In this article I present my measurement results as well as introduce an alternative &lt;a href="http://www.gevent.org/"&gt;gevent&lt;/a&gt;-based implementation to improve the performance.&lt;br /&gt;&lt;br /&gt;There are several performance aspects of this software, e.g., CPU utilization per call or session, memory usage, bandwidth requirement, etc. This article only focuses on the CPU performance. Moreover, I only consider the steady state CPU usage to measure the number of active simultaneous calls through the gateway. The CPU usage during call setup and termination is not considered.&lt;br /&gt;&lt;br /&gt;The conclusion of my measurement is as follows. 
The SIP-RTMP gateway software using gevent takes about 2/3 of the CPU cycles of the multitask version, and the RTMP server software using gevent takes about 1/2 of the CPU cycles of the multitask version. After the improvements, on a dual-core 2.13 GHz CPU machine, a single audio call going through gevent-based siprtmp using the Speex audio codec at 8 kHz sampling takes about 3.1% CPU, and hence in theory the machine can support about 60 active calls in steady state. Another way to look at it is that the software requires CPU cycles of about 66 MHz per audio call.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The gevent-based software is also available under the same license for you to try out. The next step to further improve the performance is to move part of the media processing of siprtmp to an external C/C++ extension module.&lt;br /&gt;&lt;br /&gt;&lt;div&gt;&lt;b&gt;Background&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Traditionally, I have used the &lt;a href="http://code.google.com/p/python-multitask"&gt;multitask&lt;/a&gt; framework for co-operative multitasking in my Python software including p2p-sip, rtmplite and siprtmp. In the past, people have complained about high CPU utilization in siprtmp for a single call or even with no call. Part of the discussion is documented in &lt;a href="http://code.google.com/p/rtmplite/issues/detail?id=31"&gt;issue 31&lt;/a&gt;. It turned out that the no-call CPU usage was a bug, and that we could optimize the multitask framework to improve the performance by approximately 2x. The optimization alters the way in which the multitask framework looks for io-events and more tasks. In particular, it gives more preference to tasks than to io-events, hence if a single io-event generates multiple tasks, all of them run before waiting for the next io-events. These optimizations and fixes are in SVN r60 and r68. 
Unfortunately, these optimizations are not enough.&lt;br /&gt;&lt;br /&gt;To further improve the performance, I looked at the built-in &lt;a href="http://docs.python.org/library/asyncore.html"&gt;asyncore&lt;/a&gt; module of Python and re-implemented rtmp.py to use asyncore. There was a significant improvement of approximately another 1.5x to 2x. Unfortunately, getting timers to work with asyncore is not trivial. Hence I couldn't implement siprtmp easily as the SIP/RTP library relies heavily on timers.&lt;br /&gt;&lt;br /&gt;Then I looked at the &lt;a href="http://www.gevent.org/"&gt;gevent&lt;/a&gt; project, thanks to a co-worker for recommending it. It supports co-routine based co-operative multitasking by modifying the existing blocking modules such as socket. Compared to the multitask framework, the source code using gevent is more readable and easier to maintain because gevent does its work behind the scenes. In contrast, the multitask framework requires &lt;span class="Apple-style-span"&gt;yield&lt;/span&gt; statements scattered everywhere and a non-trivial &lt;span class="Apple-style-span"&gt;StopIteration&lt;/span&gt; exception to return from a task. I re-implemented siprtmp.py, and related SIP/RTP modules, using gevent. Since the siprtmp module includes all of the rtmp module, this can also be used as an RTMP server in addition to being a SIP-RTMP gateway.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Test Setup&lt;/b&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;All my tests were done on my MacBook laptop, 2.13 GHz Intel Core 2 Duo, 2GB memory, and running Mac OS X 10.5.6. I used Python 2.7 for server side components and Flash debug player version MAC 10,0,45,2 (&lt;a href="http://kb2.adobe.com/cps/155/tn_15507.html"&gt;how to find?&lt;/a&gt;). I used X-lite version 3 as a standard SIP client. The debug trace on the server was disabled, by not supplying any -d option. All my clients and server ran on my local host, hence bandwidth was not an issue. 
I used Mac's Activity Monitor to measure the CPU usage.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Measurement Result&lt;/b&gt;&lt;/div&gt;&lt;div&gt;The main metric is the &lt;i&gt;CPU usage&lt;/i&gt; in percentage as reported by the Activity Monitor. Several parameters were altered and their effects were measured. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The siprtmp performance was measured for an audio call between a web-based &lt;a href="http://myprojectguide.org/p/siprtmp/"&gt;VideoPhone&lt;/a&gt; sample application available as part of the siprtmp software, and the third-party X-Lite application. The &lt;i&gt;sampling rate&lt;/i&gt; of the Speex audio codec can be 8kHz or 16kHz. The larger the sampling rate, the larger the encoded packet is. The CPU usage increases with higher sampling rate. Note that there is no transcoding in siprtmp. The following table shows the percentage CPU usage for siprtmp using multitask and gevent, and for the two sampling rates.&lt;/div&gt;&lt;br /&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Rate&lt;/td&gt;&lt;td&gt;multitask&lt;/td&gt;&lt;td&gt;gevent&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8 kHz&lt;/td&gt;&lt;td&gt;4.8-5.1%&lt;/td&gt;&lt;td&gt;3.1-3.2%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;16 kHz&lt;/td&gt;&lt;td&gt;6.2-6.5%&lt;/td&gt;&lt;td&gt;4.0-4.1%&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;div&gt;Based on these, we can conclude that the gevent-based SIP-RTMP gateway takes about 2/3 of the CPU of the multitask-based gateway. Roughly, the gevent-based gateway takes about 66 MHz/audio-call of the CPU cycles in steady state.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The rtmp performance was measured using one publisher and zero or more players. The CPU usage increases with the number of players. 
Typically, an audio-only session gives less variance in the CPU usage, whereas if video is included then depending on the amount of movement or image details the packet size changes, and so does the CPU usage. I used the &lt;a href="http://myprojectguide.org/p/flash-videoio/test.html"&gt;Flash VideoIO project's test page&lt;/a&gt; to perform the tests. If video is present, then Flash Player's camera capture uses these properties: cameraQuality=80, cameraWidth=320, cameraHeight=240, cameraFPS=12. Audio is always Speex 16 kHz with encodeQuality=6. The following table shows the percentage CPU usage using multitask and gevent, with one publisher and different numbers of players, and with or without video. If the variance is small, only the average is reported, whereas if the variance is large the range is listed.&lt;/div&gt;&lt;br /&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Media&lt;/td&gt;&lt;td&gt;#players&lt;/td&gt;&lt;td&gt;multitask&lt;/td&gt;&lt;td&gt;gevent&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Audio&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;2.2%&lt;/td&gt;&lt;td&gt;1.3%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Audio&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;3.5%&lt;/td&gt;&lt;td&gt;1.8%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Audio&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;4.5%&lt;/td&gt;&lt;td&gt;2.1%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Audio&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;5.5%&lt;/td&gt;&lt;td&gt;2.5%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Audio+Video&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;3.0-3.9%&lt;/td&gt;&lt;td&gt;1.4-1.7%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Audio+Video&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;4.2-4.7%&lt;/td&gt;&lt;td&gt;2.1%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Audio+Video&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;5.5-6.3%&lt;/td&gt;&lt;td&gt;2.7%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Audio+Video&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;7.0-7.6%&lt;/td&gt;&lt;td&gt;3.1%&lt;/td&gt;&lt;/tr&gt;&lt;
/tbody&gt;&lt;/table&gt;&lt;br /&gt;Based on these, we can conclude that the gevent-based software takes about 1/2 of the CPU of the multitask-based software for RTMP streaming. Roughly, the gevent-based server takes 34 MHz/publisher and 12 MHz/player of the CPU cycles in steady state.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-9220703664875574725?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/9220703664875574725/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=9220703664875574725' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/9220703664875574725'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/9220703664875574725'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2011/04/performance-of-siprtmp-multitask-vs.html' title='Performance of siprtmp: multitask vs gevent'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-2613923624196639377</id><published>2010-12-07T19:03:00.007-05:00</published><updated>2010-12-07T19:24:05.851-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Programming'/><category scheme='http://www.blogger.com/atom/ns#' term='Google App Engine'/><category scheme='http://www.blogger.com/atom/ns#' term='Flash Player'/><category 
scheme='http://www.blogger.com/atom/ns#' term='API'/><category scheme='http://www.blogger.com/atom/ns#' term='Open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='Channel API'/><title type='text'>Implementing video conferencing and text chat using Channel API</title><content type='html'>Last week, Google finally released the Channel API [&lt;a href="http://code.google.com/appengine/docs/python/channel/overview.html"&gt;1&lt;/a&gt;, &lt;a href="http://googleappengine.blogspot.com/2010/12/happy-holidays-from-app-engine-team-140.html"&gt;2&lt;/a&gt;] for Google App Engine. It has been available to developers for six months [&lt;a href="http://googleappengine.blogspot.com/2010/05/app-engine-at-google-io-2010.html"&gt;3&lt;/a&gt;], but not on actual app engine for production. I had built a few video conferencing and text chat applications [&lt;a href="http://random-face.appspot.com"&gt;4&lt;/a&gt;, &lt;a href="http://public-chat.appspot.com"&gt;5&lt;/a&gt;] using Flash VideoIO project [&lt;a href="http://code.google.com/p/flash-videoio"&gt;6&lt;/a&gt;] on Google App Engine. Earlier, I had to use Ajax/polling technique to get events related to chat and user list. In the last couple of days, I modified those applications to use the asynchronous event notifications using the Channel API. More text from [&lt;a href="http://code.google.com/p/flash-videoio"&gt;6&lt;/a&gt;] follows:&lt;br /&gt;&lt;br /&gt;"Random-Face [&lt;a href="http://random-face.appspot.com/"&gt;4&lt;/a&gt;]: This is a chatroulette-type application built using the Flash VideoIO component on Adobe Stratus service and Python-based Google App Engine. ... 
You can view the source code of two files, &lt;a href="http://random-face.appspot.com/static/index.html.txt"&gt;index.html&lt;/a&gt; that renders the front end user interface and &lt;a href="http://random-face.appspot.com/static/main.py.txt"&gt;main.py&lt;/a&gt; that forms the back-end service."&lt;br /&gt;&lt;br /&gt;"Public-Chat [&lt;a href="http://public-chat.appspot.com/"&gt;5&lt;/a&gt;]: This is a multi-party audio, video and text chat application built on top of Python-based Google App Engine and using Channel API for asynchronous instant messaging and presence. ... Developers can see the source code files: &lt;a href="http://public-chat.appspot.com/static/index.html.txt"&gt;index.html&lt;/a&gt; is the front-end user interface, &lt;a href="http://public-chat.appspot.com/static/webtalk.js"&gt;webtalk.js&lt;/a&gt; is the client side Javascript to do signaling, and &lt;a href="http://public-chat.appspot.com/static/main.py.txt"&gt;main.py&lt;/a&gt; is the back-end service code."&lt;br /&gt;&lt;br /&gt;The Channel API essentially implements an XMPP-style asynchronous communication from your server to the Javascript client. 
I use this to notify other participants in the system of new messages, changes in the user list, and updates to a user's video session.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-2613923624196639377?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/2613923624196639377/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=2613923624196639377' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2613923624196639377'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2613923624196639377'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2010/12/implementing-video-conferencing-and.html' title='Implementing video conferencing and text chat using Channel API'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-8325969664281589304</id><published>2010-12-02T15:10:00.003-05:00</published><updated>2011-05-28T13:08:49.772-04:00</updated><title type='text'>What is Flash Media Gateway?</title><content type='html'>&lt;div&gt;I recently saw a description of Adobe's Flash Media Gateway [&lt;a href="http://labs.adobe.com/technologies/flashmedia_gateway/"&gt;1&lt;/a&gt;] and related information on how Adobe Connect 8 can use it to make and receive SIP calls [&lt;a 
href="http://www.adobe.com/products/adobeconnect/tech-specs.html"&gt;2&lt;/a&gt;]. This article lists my views on the advantages and problems of such a gateway architecture. (Disclaimer: I have not used any of these products, so my views may be completely wrong.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In summary, the new Flash Media Gateway is similar to the other SIP-RTMP gateway products that have existed for a few years, e.g., siprtmp, gtalk2voip, flaphone and red5phone. I feel the industry demand for interoperation between Flash Player and SIP devices eventually forced the company to do something about it. Unfortunately, what it did is sub-optimal, as I describe here. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I have been involved in the development of the open source siprtmp project [&lt;a href="http://code.google.com/p/siprtmp/"&gt;3&lt;/a&gt;], hence I can speak from experience about the advantages and problems of such an architecture. I have also blogged earlier an FAQ on using Flash Player to make phone calls [&lt;a href="http://p2p-sip.blogspot.com/2010/02/faq-on-using-flash-player-to-make-phone.html"&gt;4&lt;/a&gt;].&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Advantages of Flash Media Gateway&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;It allows you to build Flash applications that can talk to SIP devices using Adobe's servers at the back end. While it is not useful for those who have already resorted to other solutions such as Red5 and Wowza, it is useful to those who use Adobe's Flash Media Server (FMS) and do not want to switch to other alternatives for any reason. &lt;i&gt;Problem&lt;/i&gt;: It is not clear whether the Flash Media Gateway can work with other media servers such as Red5 or Wowza.&lt;/li&gt;&lt;li&gt;It supports audio transcoding among Speex, Nellymoser and G.711, as well as mixing for a simple conference bridge. 
This allows working with older Flash Players that do not have Speex and with SIP devices that do not have Speex. A third-party product such as siprtmp is typically reluctant to implement transcoding with Nellymoser because of licensing restrictions. &lt;i&gt;Problem&lt;/i&gt;: In general, transcoding is not the best option because it takes significant CPU cycles on your (expensive) hosted servers. It can drastically reduce the capacity of your server, e.g., a server that supports 100 calls with Speex might support only 10 calls transcoded between Speex and G.711.&lt;/li&gt;&lt;li&gt;It supports video using H.264. &lt;i&gt;Problem&lt;/i&gt;: It is not clear whether it allows only one-direction H.264 from SIP device to Flash Player, or whether it supports bi-directional H.264. Bi-directional H.264 would be a huge advantage, but would mean that Flash Player is capable of capturing and sending H.264 video, which does not appear to be the case.&lt;/li&gt;&lt;li&gt;It can potentially support UDP between Flash Player and server. Note that one of the biggest issues with real-time voice calls with Flash Player was that RTMP (over TCP) caused high latency not suitable for interactive communication. Adobe added another protocol, RTMFP (over UDP), that could allow an end-to-end media path among the participants, thus drastically reducing the end-to-end audio latency. While a gateway architecture does not allow an end-to-end media path, it can still allow UDP between Flash Player and the media server using RTMFP. This could reduce the end-to-end latency to some extent. &lt;i&gt;Problem&lt;/i&gt;: It is not clear whether RTMFP can be used in conjunction with Flash Media Gateway.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;&lt;b&gt;Problems with Flash Media Gateway&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;It does not allow you to build a SIP client in the browser. The communication between the Flash application and the media server/gateway is still over RTMP (or RTMFP). 
This means that, unlike the true end-to-end media path of SIP calls, the media must go through the server/gateway. I don't think the Connect plugin implements a SIP/RTP stack, because it says that it uses the gateway in the back end.&lt;/li&gt;&lt;li&gt;If RTMFP is not allowed for such SIP calls, then the RTMP (over TCP) connection will significantly contribute to latency, which is not suitable for interactive voice calls unless you have deployed the gateway close to your user.&lt;/li&gt;&lt;li&gt;Most SIP-PSTN gateways that translate SIP calls to the phone network support the traditional voice codecs G.711, G.729 and G.723.1, but not Speex or Nellymoser, whereas Flash Player supports only Speex and Nellymoser for captured voice. Thus you always need transcoding. Unfortunately, G.711 at 64 kb/s is expensive on bandwidth compared to, say, G.729 at 8 kb/s. Since the gateway does not support the other common voice codecs of PSTN providers, in most cases you will need to transcode twice, or live with higher bandwidth usage. &lt;/li&gt;&lt;li&gt;It does not add significant value beyond what already exists with red5phone or siprtmp. You still need to use a third-party SIP provider who can terminate your PSTN calls. It does not optimize the media path latency because of the gateway architecture. 
And finally it does not really improve the call experience for Flash to SIP calls to the end-user.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;Ideally, the SIP/RTP and related protocols should become part of Flash Player, so that it allows one to create a SIP user agent in the browser and enable low latency end-to-end media path with third-party SIP user agents.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-8325969664281589304?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/8325969664281589304/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=8325969664281589304' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/8325969664281589304'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/8325969664281589304'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2010/12/what-is-flash-media-gateway.html' title='What is Flash Media Gateway?'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-3656378870729507194</id><published>2010-11-24T14:56:00.009-05:00</published><updated>2010-11-24T17:29:41.871-05:00</updated><title type='text'>How to extend HTML5 for real-time video communication?</title><content type='html'>A few months ago, I 
was discussing HTML5 with a friend of mine. We tried to figure out what it would take to extend it to support web-based video communication. The proposed HTML5 already includes audio and video tags, but these are useful only for streaming video applications. This article presents more refined thoughts on how to extend the browser to support video communication.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;First approach: extend the video tag&lt;/b&gt;&lt;/div&gt;&lt;div&gt;W3C has added a new &lt;a href="http://www.w3.org/TR/html5/video.html#video"&gt;video element&lt;/a&gt; in HTML5 to facilitate playback of interoperable video formats across browsers. Existing web sites use the "object" element to run an external plugin such as Flash Player for video playback, which is intended to be replaced by HTML5's video element. This allows browser manufacturers, especially for phones and other devices, to easily play back web videos without having to implement the full Flash Player plugin. The "src" property allows specifying the URL of the video to play, and additional properties such as poster, preload, autoplay, loop and controls allow controlling the behavior of the video player.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One way to support video communication is to extend the video element with additional properties that allow it to capture and publish local video, and control camera and microphone behavior. For example, in a two-party call between Alice and Bob, Alice can have two video elements, one to publish local video to URL stream "alice" and the other to play remote video from URL stream "bob". Similarly, Bob can have two video elements, one to publish local video to URL stream "bob" and the other to play remote video from URL stream "alice". 
The "src" property can specify the central media server or rendezvous server location as well as the publish or play stream names, e.g., "rtmp://server/conf123?publish=alice".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is the idea behind my &lt;a href="http://code.google.com/p/flash-videoio/"&gt;Flash-based audio and video communication&lt;/a&gt; project. In addition to existing properties such as src, preload, autoplay, loop and controls, it defines new properties for microphone, camera, playing, recording, etc., as you can see on &lt;a href="http://myprojectguide.org/p/flash-videoio/10.html"&gt;How to use the VideoIO API?&lt;/a&gt;. It also overloads the "src" property to allow "rtmp" and "rtmfp" URLs for media server or rendezvous server location, respectively. This application with its new properties can be used as a drop-in replacement for a video element that supports video communication in the browser.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This approach of extending the existing video element with new properties works well for two-party as well as multi-party conferences, and for centralized as well as end-to-end media paths. The nice thing about this approach is that it keeps the actual call signaling out of scope of the video element, e.g., your web application implements call signaling using existing Javascript/Ajax/websocket/server-event technologies. It keeps the specific rendezvous protocol mechanism such as "rtmp", "rtmfp", and in the future "sip" or "rtsp", outside the video element. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;To avoid interoperability problems, a minimum subset of supported rendezvous protocols is recommended. 
The requirement of such a protocol is to support real-time media transport, preferably over UDP, in a centralized or end-to-end path in the presence of network middle boxes such as NATs and firewalls.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Second approach: define new connection object&lt;/b&gt;&lt;/div&gt;&lt;div&gt;The previous approach integrates capture, playback and connection functions into a single video element, with additional properties. Alternatively, these functions can be split into different elements and Javascript objects, e.g., the video element does display/playback, but new camera and microphone objects allow capture, and a new connection object allows an end-to-end real-time media path among participants. The Javascript application actually connects these different elements and objects to build a complete video communication system.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There are &lt;a href="http://rtc-web.alvestrand.com/papers"&gt;several proposals&lt;/a&gt; on what the new connection or transport API will look like. Example attributes are: protocol (udp or tcp), list of reflectors and relay servers, mode (initiating or receiving), secure (boolean). Additionally, it has methods such as connect and send, and events to indicate connection status and incoming data. Existing protocols such as &lt;a href="http://tools.ietf.org/html/rfc5245"&gt;ICE&lt;/a&gt;, &lt;a href="http://tools.ietf.org/html/rfc5389"&gt;STUN&lt;/a&gt;, &lt;a href="http://tools.ietf.org/html/rfc3550"&gt;RTP/RTCP&lt;/a&gt; and &lt;a href="http://tools.ietf.org/html/rfc3261"&gt;SIP&lt;/a&gt; may be implemented in the browser or external gateways to support such a transport object. Finally, these transport objects can be piped with display and capture components, audio and video codecs and filters, etc., to implement a complete video communication application. 
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In summary, this approach defines new Javascript objects such as Transport, Camera, Microphone, Codec, etc., and allows the application to connect these objects to build a real application. This is more complex than the first approach, but allows fine-grained application logic. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Third approach: use external application&lt;/b&gt;&lt;/div&gt;&lt;div&gt;This approach acknowledges the limitations of HTML and does not try to "add" video communication to it. We are considering this approach of a separate application in our &lt;a href="http://myprojectguide.org/node/158"&gt;web communications project&lt;/a&gt; at Illinois Institute of Technology.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;While the idea of extending HTML to support video communication is useful and interesting, there are many limitations. First, incompatibility in HTML among browsers has been a nightmare for web developers in the past, and extending HTML for yet another feature is bound to cause more interoperability problems. Second, browser manufacturers are sometimes not too keen to add a new feature, e.g., for business reasons if it competes with the manufacturer's existing product or service. Third, the video element of HTML5 lacks some digital rights management related features, which causes media owners to publish their media using a restricted Flash application. Fourth, adoption of the new HTML5 is slow, so web site developers still need to fall back to a Flash-based application for video playback, at least in the short term. Finally, adding capture and end-to-end transport components in HTML5 gives rise to a plethora of issues related to privacy, security and denial of service attacks, in case of faulty browser implementation. 
For these reasons, many people believe that&lt;i&gt; extending HTML and browsers to support video communication&lt;/i&gt; is not the right approach.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Hundreds of applications exist that implement consumer video communication. Some popular ones are Skype, Gmail, tinychat and Facetime. The technologies behind these are drastically different, especially for signaling and control. However, at the bottom, every video communication application tends to establish some form of end-to-end UDP-based real-time media path, and to fall back to relays if that fails. As mentioned before, IETF standards exist to establish such media paths and relays. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Imagine a standard-compliant resident application, &lt;i&gt;rtc-app&lt;/i&gt;, that runs on the user's machine independent of the browser, but allows any application, including the browser, to establish a real-time media path. The browser can use an existing API such as websocket or HTTP to interact with rtc-app. The rtc-app application is not owned by a specific vendor, and is installed by the end-user. This avoids re-implementation of the feature by every vendor who wants to do real-time video communication. To address the privacy and security concerns, rtc-app must directly ask permission from the end-user before initiating or accepting a connection, instead of doing so automatically (and randomly) on API calls. This is similar to how Flash Player asks the end-user for permission to capture from microphone or camera, but can remember the application for future use if told so by the end-user.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The main advantage of this approach is that it does not require changing the browser or HTML, but still is a generic implementation-focussed way to enable real-time video communication for many other applications. If an existing vendor such as Skype or Google opens up its API, it will be a big step forward. 
While rtc-app can provide transport functions, the audio and video capture still needs to be done somehow. Various codec licensing issues may prevent us from including it in rtc-app, but a Flash Player based application similar to the first approach can perform capture on its behalf. The main problem with this approach is that it requires an additional download and install by the end-user.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-3656378870729507194?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/3656378870729507194/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=3656378870729507194' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3656378870729507194'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3656378870729507194'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2010/11/how-to-extend-html5-for-real-time-video.html' title='How to extend HTML5 for real-time video communication?'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' 
src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-316736827064809350</id><published>2010-11-08T14:02:00.002-05:00</published><updated>2010-11-08T15:53:51.187-05:00</updated><title type='text'>How to conduct a technical interview for software engineer?</title><content type='html'>(This article presents my thoughts on how to effectively conduct a technical interview for a software engineering position. It presents the "interviewer's" point of view based on more than 30 technical interviews I have conducted, and the quality of candidates I have recommended. If you are an "interviewee", I suggest you look elsewhere, e.g., &lt;a href="http://kundansingh.com/interview/"&gt;interview questions&lt;/a&gt;.)&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;span&gt;&lt;b&gt;Know the position you are hiring for.&lt;/b&gt;&lt;/span&gt; If you have been part of a software engineering team or have read the book "The mythical man-month", you know that you need several different "types" of members in a successful team. You need a "magician", who knows or can figure out the solution to every technical problem you may have. You need a couple of "plumbers" who are willing to fix any broken software piece. You need a "general" who is very motivated about what you are doing, knows how and when to delegate, and keeps everyone together. You need a few "soldiers" who can follow orders, do the job, and be happy to contribute. And so on. As an interviewer, you need to know what position you are hiring for, and tailor your interview to that requirement. One interview pattern does not fit all types.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Do your homework. &lt;/b&gt;Before the interview, thoroughly read the candidate's resume/CV. If she has extensive work experience, identify only one or two of her past projects to focus on. 
If you have even a slight doubt about her programming ability, prepare a written programming test. If possible, schedule a separate or additional time slot before your face-to-face interview for the programming test. Do not use any existing online programming test material, otherwise you won't be able to distinguish between someone who knows how to program and someone who has gone through many web sites containing interview questions. Do not give take-home tests. Do not share your programming interview questions with other interviewers in your organization. &lt;/li&gt;&lt;li&gt;&lt;b&gt;Start with knowledge questions.&lt;/b&gt; During the interview, after initial introductions, start with a question on her past experience. Your interview should balance between knowledge and application types of questions. Do not ignore her experience or knowledge, and do not focus &lt;i&gt;only&lt;/i&gt; on her experience. Getting started with what the candidate already knows is also a good way to make her comfortable. You can ask something from her past project, e.g., "Describe in one minute what you did in XYZ?", or ask about a past technology that she used extensively, e.g., "Did you use STL in C++? What are the common STL classes available?"&lt;/li&gt;&lt;li&gt;&lt;b&gt;Focus on real application problems.&lt;/b&gt; Most software engineering positions require applying your existing knowledge to a new problem. The one quality which distinguishes a good programmer from a mediocre programmer is that a good programmer can easily translate your problem into pseudo-code. If you are interviewing for "soldiers" and not a "magician" or "general", avoid discussing high-level design problems, and instead focus on low-level real technical problems. For example, instead of asking "How would you design a scalable web server for blah blah?" ask more specific questions. 
In my experience, people who can answer high-level design questions can create "vaporware", but those who can translate a small real problem to pseudo-code can actually write "software". If you need software engineers, avoid wasting time on high-level design questions. Also, such application problems should be independent of specific domains and should simply test whether the candidate has the required mathematical and computer skills to translate your problem to pseudo-code. I have given some examples later.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Follow thought processes and provide hints.&lt;/b&gt; If you believe that the candidate is getting diverted toward an incorrect answer, there is no harm in giving hints or counter-questions to course-correct her thoughts. Do not be too adamant on your answer. Sometimes, a 75% correct answer is good enough. &lt;/li&gt;&lt;li&gt;&lt;b&gt;Provide itemized feedback.&lt;/b&gt; When you submit your recommendation about a candidate to HR or your manager, specifically itemize individual qualities and performance, and emphasize specific skills and the lack thereof. For example, "I had a nice 45 min conversation with XYZ, and I found her to be a very good programmer who needs training on Flex. After initial introductions, I asked one algorithm and three programming questions. She did well on two programming ones and average on the others. Programming ability: very good; Needs hand-holding: yes; Algorithms: average; Strength: programming; Weakness: Flex; Recommendation: weak accept." 
My final recommendation is one of strong-accept, weak-accept, weak-reject, or strong-reject, with implied meaning of "a very strong candidate, and must hire her", "a good enough candidate, but won't argue to hire her if others disliked her", "an average candidate, but won't argue to reject her if others strongly liked her", "a poor candidate, and must not hire her", respectively.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;As an interviewer, you may be wondering about examples of real questions that would distinguish a good programmer from an average one. These are some examples. As mentioned before, you should create your own questions instead of using these, otherwise you cannot distinguish a candidate who genuinely solved the problem from one who has read this blog.&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Video conferencing layout: suppose you know the window dimension, WxH, and want to fit participant videos in an MxM tile. Each video has a fixed aspect ratio of 4:3. All video objects are of the same size in the layout. Your MxM tile should be laid out in the middle-center, with potential empty spaces near window edges. The layout should maximize the size of the MxM tile, so that the empty spaces near edges are minimized. You are given an array of video objects V[] and a function layout(v:video, x, y, w, h) which lays out a single video object with size (w, h) at position (x, y) inside the window. Write pseudo-code to lay out participant videos. (Hint: start with 1 video, then 2x2, then 3x3, then generalize. Additional questions: how would you modify it to an NxM tile instead of MxM? What should happen if the number of videos is more than 9 but less than 16 -- which boxes are empty? How would you modify it so that empty spaces including empty boxes are minimized in the NxM layout?)&lt;/li&gt;&lt;li&gt;Path optimization: suppose you have a map of a city with Manhattan-style layout. Suppose north-south streets are named a1, a2, etc., and east-west streets are named b1, b2, etc. 
Some streets have traffic signals, with a 5-second walk sign, a 15-second countdown to continue walking if started, and a 20-second don't-walk sign, repeating periodically in that order. Other streets do not have traffic signals, in which case traffic must yield to pedestrians. Suppose you need to walk from the corner of a5/b5 to the corner of a7/b10, and the only street with traffic lights on your way is a6. You walk at a constant speed. Crossing a6 takes 15 seconds whereas crossing any other street takes only 5 seconds. You do not want to cross a6 if you know you can't finish before the sign turns to don't walk. You want to minimize the time taken from source to destination, hence minimize the time waiting on traffic lights. You have functions named walk(), turn(left or right) and stop(). Write pseudo-code for your decision process from your source to destination point. (Hint: draw out the map first, then it becomes easy to visualize and solve. Additional question: can you generalize between any two points as long as you know the complete map and which streets have signals?)&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;If you have more ideas, feel free to comment.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-316736827064809350?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/316736827064809350/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=316736827064809350' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/316736827064809350'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/316736827064809350'/><link rel='alternate' type='text/html' 
href='http://p2p-sip.blogspot.com/2010/11/how-to-conduct-technical-interview-for.html' title='How to conduct a technical interview for software engineer?'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-3585910089533952814</id><published>2010-10-19T19:34:00.002-04:00</published><updated>2010-10-19T20:31:49.703-04:00</updated><title type='text'>Why do P2P-SIP?</title><content type='html'>One of the questions I often get: if SIP itself is peer-to-peer, why do P2P SIP?&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Unlike a typical client-server web application or Jabber presence system, a SIP user agent is both client and server. This means a SIP user agent can send as well as receive a SIP request. In theory, a SIP proxy is not a required component in a SIP system. Ideally, the caller does not need to know whether a call to "sip:bob@some-server.com" will be received by a SIP user agent directly or by an intermediate SIP proxy server first. In practice, people often refer to an intermediate SIP proxy (and registration) server as the SIP server, and such SIP servers are the norm rather than the exception.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In theory, once the end-to-end SIP session is established between the caller and callee user agents, the media can traverse directly between the two user agents on the IP network. This is what makes SIP systems support a peer-to-peer (or end-to-end) media path. In practice, the presence of network address translators and firewalls prevents direct IP connectivity between the two user agents for the media path. 
This requires workarounds such as Interactive Connectivity Establishment (ICE) or network relay services such as a media proxy.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A bigger problem in practice is that, for business reasons, the VoIP provider does not want the media path to be end-to-end, so that it can keep control over the "service" for accounting, billing, advertisement or other reasons.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In short, SIP is a tool that enables a peer-to-peer (or end-to-end) media path as well as signaling path. Protocols are like tools, which vendors use to build applications and systems, similar to how a construction worker uses various tools to build a house. In the current Internet, many vendors have used SIP to build closed walled gardens of managed services, unlike what SIP was initially envisioned for.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;P2P-SIP, on the other hand, is a response to this problem: it explicitly defines SIP-based communication without using managed servers. Instead of central SIP servers managed by your provider, which can impose constraints that break end-to-end media and communication services, P2P-SIP aims at decentralizing the registration and proxy functions so that the signaling path is peer-to-peer. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The main benefits of P2P-SIP are as follows:&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Organizations and service providers can save the cost of server maintenance. In particular, no new training or position is required specifically for VoIP IT staff. There is no need to host dedicated servers in data centers with 99% availability and pay for energy and bandwidth.&lt;/li&gt;&lt;li&gt;The VoIP industry can move away from the traditional service-provider-oriented business to a more open end-to-end user application. 
Essentially, it becomes more in line with your other Internet services such as web and email: web browsers don't distinguish between servers or domains, and you can send email using any mail client and any provider to anybody else.&lt;/li&gt;&lt;li&gt;Finally, the most important reason is that the cost saving eventually propagates to end-users, who can enjoy free VoIP service as long as they are paying for their IP network access. Peer-to-peer infrastructure enables highly scalable communication systems such as Skype at a very low cost. A small vendor's VoIP service is able to scale to millions of users only if it can save the cost of server maintenance and bandwidth.&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;The important aspects of P2P-SIP are: &lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;It does not depend on a VoIP service provider for the signaling and media paths. Hence there is not much money for managed services in P2P-SIP.&lt;/li&gt;&lt;li&gt;It can use end-user devices on the public Internet to relay the media path for end-users behind restricted networks. Hence it requires an incentive for public end-users to help restricted end-users.&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;div&gt;As you can see, the benefits of P2P-SIP are for end-users, at the cost of service provider businesses. Most of the existing VoIP effort is driven by large corporations and carriers who have no interest in opening the service to end-users and losing control over them. 
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-3585910089533952814?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/3585910089533952814/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=3585910089533952814' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3585910089533952814'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3585910089533952814'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2010/10/why-do-p2p-sip.html' title='Why do P2P-SIP?'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-8504735702547749378</id><published>2010-10-16T14:08:00.003-04:00</published><updated>2010-10-16T15:55:07.124-04:00</updated><title type='text'>Theory vs practice of SIP-based VoIP</title><content type='html'>I recently attended the VoIP conference and expo [&lt;a href="http://www.cvent.com/EVENTS/Info/Custom.aspx?cid=22&amp;amp;e=e2f0ff38-a913-4f21-a842-58e29285fafa"&gt;1&lt;/a&gt;], at Illinois Institute of Technology, organized by Prof Carol Davids, and also got a chance to speak on a couple of topics [&lt;a href="http://kundansingh.com/#talks"&gt;2&lt;/a&gt;]. 
There were many interesting presentations at the conference, giving perspectives from leading software and equipment vendors, carriers and service providers, government bodies, standardization forums as well as open source developers. This article presents some of my thoughts regarding the theory and practice of SIP-based VoIP.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The inaugural session included a demonstration of an IP-based 911 call by students, who had integrated software pieces developed at other universities. It was great to see sipd [&lt;a href="http://www.cs.columbia.edu/irt/cinema/doc/sipd.html"&gt;3&lt;/a&gt;] being used by other universities for exciting new projects, and it brought back memories of when we were developing sipd.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;Theory&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The session initiation protocol (SIP) was invented to create and control multimedia sessions on the Internet because the previous protocols were either insufficient (e.g., HTTP, RTSP) or too complex (e.g., H.323). Unlike HTTP, which typically requires a dedicated server, SIP was designed to be more peer-to-peer. Hence, your VoIP phone itself acts as both client and server to send and receive SIP requests. Ideally, you do not need to keep the SIP proxies in the session path, except for initial call setup. The protocol is designed to encourage subsequent requests such as ACK, BYE or re-INVITEs to be end-to-end, if possible. The protocol includes a mechanism that lets an intermediate proxy require that all subsequent requests in a session be sent via that proxy. But this was designed to be used as an exception rather than a rule. 
The motivation is scalability: keep the proxies lightly loaded.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In theory, a SIP-based system follows the end-to-end principle of the Internet: keep the intelligence at the ends, and have the network (or intermediate proxies) be dumb. The end-to-end principle has been crucial to the success of the Internet and, more recently, of web applications. As long as you keep the network provider independent of your application provider, you see an explosion of application innovations.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;Practice&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In practice, your SIP-based system is typically "owned" by your network provider, who has a business incentive to provide you with billable applications and services, and to prevent you from talking to other open SIP-based systems without going through their billed "services". The largest SIP systems such as Comcast and Verizon digital voice are designed to be closed systems which use SIP in the network without allowing end-users to access it directly. More recently, Apple's Facetime is a closed service and does not inter-operate with other SIP services. With the SIP-based IP multimedia subsystem (IMS) being adopted by wireless carriers, there is more incentive for businesses to convert SIP from an end-to-end protocol to a network-centered and managed service.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One of the terms I kept hearing during the VoIP conference was "managed" services: why they are useful for consumers, and what kind of new innovations are happening. When I looked at the details, these services and innovations are basically what SIP already provides, e.g., service APIs, unified communications, etc. I had the opportunity to work on some of these 5-10 years ago when I was at Columbia University. 
It is discouraging to see that, because of the narrow-minded business incentives of vendors and carriers, walled gardens of SIP systems are created which prevent open innovation and require many times the effort to get basic features to consumers. First, (1) the vendors and carriers use an open protocol, SIP, to build a VoIP system, then (2) invest resources in making it a walled garden, and finally (3) invest more money and resources to create a federation of these walled gardens. In the end, (2) and (3) nullify each other, and just (1) was sufficient. All the money invested in (2) and (3) gets wasted in the long term.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;Conclusion&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I would like to advise vendors and carriers to just focus on providing a good IP network with some quality of service and less restrictive NATs, and leave the rest of the VoIP services to the millions of application developers. As Henry Sinnreich said during the conference, the only service is "the Internet". Instead of providing "managed" network services, open up your network for end-to-end innovations. In the long term this will boost your network and bring more revenue. With the Internet and the web, there is more opportunity for everyone, and a walled garden approach in the network is just going to keep you away from long-term growth. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Open innovation in VoIP is bound to happen. If you do not want to be part of it, someone else definitely will. Adrian Georgescu presented Blink [&lt;a href="http://icanblink.com/"&gt;4&lt;/a&gt;], a fully featured, easy to use SIP user agent, as a great example of SIP-based open innovation. For every VoIP developer in a "managed" service organization, there are probably a thousand independent developers such as web application developers. 
Sooner or later, these developers will build some open platform or system which will attract hundred times more traffic than your managed services and hence many fold more revenue. At that time all your investment in "managed" services will go down the drain. Because if you don't open up your SIP systems, these developers will not wait for it, and build something else. This has happened before with Internet and web applications, and will happen again sooner or later with Internet voice/video communication or VoIP.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-8504735702547749378?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/8504735702547749378/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=8504735702547749378' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/8504735702547749378'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/8504735702547749378'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2010/10/theory-vs-practice-of-sip-based-voip.html' title='Theory vs practice of SIP-based VoIP'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' 
src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-8253968931330634940</id><published>2010-10-09T02:03:00.004-04:00</published><updated>2010-10-09T03:09:31.699-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Programming'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='Python'/><category scheme='http://www.blogger.com/atom/ns#' term='Protocols'/><title type='text'>Tips for implementing application protocols</title><content type='html'>This article presents some tips for implementing application protocols such as those for web services, multimedia communication, streaming or Internet telephony. The tips are mostly relevant for implementations in the Python programming language.&lt;div&gt;&lt;ol&gt;&lt;li&gt;&lt;b&gt;Keep all blocking operations outside your protocol implementation.&lt;/b&gt; This mostly includes sockets, files and timers. If you design your protocol parser and controller to be independent of blocking calls, then it can easily be adapted to various asynchronous or synchronous controllers as needed. For example, the &lt;a href="http://code.google.com/p/p2p-sip/source/browse/trunk/src/std/rfc3261.py"&gt;rfc3261.py&lt;/a&gt; module implements the core SIP stack using the Stack class. The application supplies the API for creating timers and for sending and receiving messages. When the application receives a packet on a socket, it invokes a method on the stack. When the stack has parsed the received packet and needs to inform the application of a high-level event such as an incoming call, it invokes a method on the application. This allows an application such as &lt;a href="http://code.google.com/p/p2p-sip/source/browse/trunk/src/app/voip.py"&gt;voip.py&lt;/a&gt; to provide a controller based on co-operative multitasking. 
On the other hand, the built-in HTTPServer in Python uses synchronous, blocking calls for sockets and disks. This makes the built-in class's HTTP implementation hard to use in high-performance applications that cannot afford to block. Because of this, almost every web framework implements its own HTTP handling instead of re-using the built-in class. The trade-off is that your implementation may become more involved if you keep blocking operations outside.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Do not use multi-threading in your protocol implementation.&lt;/b&gt; Firstly, getting a multi-threaded application right is very hard. Secondly, for CPU intensive tasks or disk I/O bound tasks, CPython's global interpreter lock (GIL) will prevent efficiency anyway; hence multiprocessing should be used. Thirdly, for network I/O bound applications multi-threading has an advantage, but not as much as multiprocessing. Consider using multiprocessing, but beware of cross-platform problems, especially on Windows! In my experience, co-operative multitasking (or green threads) works best for protocol implementation. If you are worried about efficiency on multi-core CPUs, you should leave that decision to the main application that will use your protocol implementation to present a client or a server application. The main application can decide whether to use multi-threading or multiprocessing and co-ordinate among them.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Decouple the protocol parsing and handling implementation.&lt;/b&gt; Sometimes you may need to use just the parser without the handler. For example, if a single incoming TCP connection can carry HTTP, SIP or RTSP messages, then it becomes easier for the application to first parse the message to determine what it is, and then invoke the appropriate handler. Because of NATs and firewalls, many application protocols need to be sent via a single port, e.g., 80 or 443. 
If an application running in Flash Player connects to your server over TCP, it will first send a socket policy request, before sending any actual application protocol message. If your protocol parser is separate from the handler, you can invoke the socket policy request parser as well as the protocol parser to determine what request it is.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Avoid blocking on DNS lookup, if possible.&lt;/b&gt; This goes back to the first point: do not block in your protocol implementation. It is usually hard to notice that a DNS lookup is blocking. Most built-in libraries provide synchronous, blocking calls for DNS lookup. Consider using an asynchronous DNS library. If that is not possible, move the DNS lookup out of the core protocol implementation and into the main application. Sometimes a DNS lookup is done during logging, e.g., to convert a client IP address to a host name, and may be hard to detect.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Log all warnings, errors and exceptions.&lt;/b&gt; In server implementations, you may be tempted to catch various exceptions and ignore them, to make your server more "robust". Unfortunately, this practice leads to more headaches later on, when some critical bug appears but is hard to detect. If you log all warning, error or exception conditions, even if you ignore them, you may be able to detect such bugs early on. A warning is suspicious behavior either in your code or in an external system. An error is a failure case due to some external problem, e.g., a file requested by the client was not found on the server. An exception is most likely a programming mistake, e.g., accessing an attribute on "NoneType".&lt;/li&gt;&lt;li&gt;&lt;b&gt;Do not hold on to resources.&lt;/b&gt; Even with automatic reference counting and garbage collection, it is still your responsibility to release any unused references. Typically the application protocol defines how long the resources should be kept, e.g., how long a SIP transaction lasts. 
But there are some resources which can persist for a much longer duration, e.g., a user's contact location. External databases are more suitable for such resources. Secondly, with an event-driven software architecture such as the listener-provider model, it is easy to get into reference loops, e.g., the listener has a reference to the provider and vice-versa. Similarly, a Message object may have a list of Header objects, and each Header may refer back to the Message. Your cleanup code should correctly free up unused references, e.g., "del varname".&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-8253968931330634940?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/8253968931330634940/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=8253968931330634940' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/8253968931330634940'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/8253968931330634940'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2010/10/tips-for-implementing-application.html' title='Tips for implementing application protocols'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' 
src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-8984188556967493926</id><published>2010-09-22T23:43:00.003-04:00</published><updated>2010-10-09T02:03:18.171-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='server'/><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='scalability'/><title type='text'>Scalability vs Performance</title><content type='html'>&lt;span class="Apple-style-span" &gt;I have been reading articles on scalability and performance. This article summarizes some of my understanding about this topic.&lt;/span&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;Scalability is the ability to scale the system to higher load. Performance determines the throughput of the system under load [&lt;/span&gt;&lt;a href="http://blah.winsmarts.com/2010-1-Scalability_Vs_Performance.aspx"&gt;&lt;span class="Apple-style-span" &gt;1&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span" &gt;]. In theory, scalability and performance are orthogonal; you can handle higher load either by scaling the system or by improving the performance of individual components of the system. In practice, scaling and performance improvement are used together to improve the overall system. &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;Suppose a single machine can handle a load of N. If it is possible to handle 2N load by adding another machine, or kN load by adding another k-1 machines, then the application is designed to be scalable. 
On the other hand, you can always try to optimize your application or buy more expensive hardware to make your application handle 2N load on a single machine. Clearly there is a limit to the performance gain on a single machine. Also, for the same amount of overall improvement, scaling the system by adding redundancy is typically much cheaper than improving the performance of a single machine by optimization or by buying more expensive hardware. &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;This seems to indicate that scaling should always be preferred. Unfortunately, designing your application for scalability is not trivial. As an example, Google AppEngine (GAE) is designed to be scalable, but not necessarily high-performance [&lt;/span&gt;&lt;a href="http://highscalability.com/google-appengine-second-look"&gt;&lt;span class="Apple-style-span" &gt;2&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span" &gt;]. On the other hand, a relational database such as MySQL can be optimized for high performance, but designing your application to scale with MySQL is a challenge. In most web applications, the database server eventually becomes a bottleneck at high load. Google's Bigtable, by contrast, is designed to be scalable. 
The tradeoff is that the GAE API does not support many relational database features such as joins, and hence requires programmers to learn a new way of storing and accessing data.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;Horizontal scalability refers to adding more machines to handle the load, whereas vertical scalability (which we call high-performance) refers to adding hardware components to existing machines, such as more memory or a better CPU, to handle higher load [&lt;/span&gt;&lt;a href="http://technet.microsoft.com/en-us/library/bb687366.aspx"&gt;&lt;span class="Apple-style-span" &gt;3&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span" &gt;]. &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;&lt;span class="Apple-style-span" &gt;High Scalability Techniques&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;Partitioning the data is the most common scalability technique. It allows you to distribute different partitions to different servers. Consistent hashing has been used in distributed hash tables and distributed server farms to assist partitioning and replication of data in the presence of high churn, when machines come and go frequently. &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;Stateless servers are much more scalable than stateful ones, because stateful servers may need to communicate with each other or share state, which limits scalability. Web servers and SIP proxy servers are easy to make stateless, whereas conference servers, presence servers or gateways are difficult to make stateless. 
Many applications too require stateful processing at the server, e.g., web applications that need stateful database storage. This concept can be used together with partitioning to build a two-stage server farm where first stage stateless servers just do load balancing whereas second stage stateful servers work on a small data partition. Unfortunately, some applications such as presence or publish-subscribe are too complex for easy data partition.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;&lt;span class="Apple-style-span" &gt;High Performance Techniques&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;The C10K problem [&lt;/span&gt;&lt;a href="http://www.kegel.com/c10k.html"&gt;&lt;span class="Apple-style-span" &gt;4&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span" &gt;] talks about the typical web server limitation of only about ten thousand simultaneous connections due to operating system and software constraints, and presents several references to improve the performance. The usual software performance bottlenecks are data copies, context switches, memory allocation and lock contention. Various techniques to handle these problems are summarized in [&lt;/span&gt;&lt;a href="http://pl.atyp.us/content/tech/servers.html"&gt;&lt;span class="Apple-style-span" &gt;5&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span" &gt;].&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;Asynchronous and non-blocking IO are commonly used to convert blocking/synchronous methods to event-based. 
Although asynchronous and non-blocking refer to almost the same thing, there are certain crucial differences in the API [&lt;/span&gt;&lt;a href="http://stackoverflow.com/questions/2625493/asynchronous-vs-non-blocking"&gt;&lt;span class="Apple-style-span" &gt;6&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span" &gt;]. Non-blocking refers to making your methods not block and hence return immediately, e.g., with an error code indicating that the operation is not complete. Typically, an additional method is available to query the state of the IO. For example, the socket API allows non-blocking mode, and you can use select to check the state of the socket, i.e., whether a read or write can be done. Thus, the application program has full control over when the read is done and in which thread/stack. Asynchronous APIs, on the other hand, are more event-based: the application registers a handler method for an event, and the system either calls the method from within a system thread when that event occurs, or posts the event to the main application's handler loop.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;A well known topic of debate is whether events or threads are more suitable for high performance servers. Theoretically, both are equivalent with non-preemptive threads and co-operative multi-tasking. But in practice, due to the way threads are implemented and the resources needed by threads, event-based systems have performed better on single CPU machines. Unfortunately, it is difficult for pure event-based systems to take advantage of multi-CPU hardware. &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;Thread pools and process pools have been used to improve system performance and take advantage of multi-CPU hardware. 
Both multi-process and multi-thread systems have been built in practice. The advantage of multi-process implementation is that multiple processes can listen for incoming connections on the same socket, whereas in multi-thread implementation only one thread can be listening on a socket. The problem in multi-process implementation is that it needs explicit inter-process communication using message passing or shared memory, whereas in multi-thread implementation it is easy to use global variables with mutex and conditions to share state. With respect to event-based systems, there are two design patterns: a reactor pattern allows the application to register for "ready" event and perform the read operation when event is received; a proactor pattern allows the application to register for "complete" event and receives the incoming data along with the read event [&lt;/span&gt;&lt;a href="http://www.artima.com/articles/io_design_patterns2.html"&gt;&lt;span class="Apple-style-span" &gt;7&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span" &gt;]. &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;The thundering herd problem in OS is that when an IO event is received all the waiting threads are woken up. But only one thread will handle the event and others will go back to sleep. This wastes CPU cycles. 
The problem and a proposed solution are described in [&lt;/span&gt;&lt;a href="http://www.citi.umich.edu/projects/linux-scalability/reports/accept.html"&gt;&lt;span class="Apple-style-span" &gt;8&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span" &gt;].&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" &gt;For a high-performance server implementation, the general consensus is to always use non-blocking IO, and to use a thread or process pool with the minimum number of threads/processes. The idea is that there should be one thread/process per CPU. This paper [&lt;/span&gt;&lt;a href="https://mice.cs.columbia.edu/getTechreport.php?techreportID=507"&gt;&lt;span class="Apple-style-span" &gt;9&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span" &gt;] describes a SIP server architecture which can maintain a few hundred thousand active TCP connections. For pure network IO it is possible to always use non-blocking IO on commodity hardware, whereas for disk IO it is not so easy. Hence, a thread-pool model with worker threads that wait for disk IO completion has been used with success in the past. 
&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-8984188556967493926?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/8984188556967493926/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=8984188556967493926' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/8984188556967493926'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/8984188556967493926'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2010/09/scalability-vs-performance.html' title='Scalability vs Performance'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-7306518558229174618</id><published>2010-09-21T15:27:00.005-04:00</published><updated>2010-09-21T19:41:37.493-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Programming'/><category scheme='http://www.blogger.com/atom/ns#' term='Open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='Problems'/><title type='text'>Lessons in starting a software project</title><content type='html'>&lt;span class="Apple-style-span"  style=" ;font-family:Times;"&gt;&lt;div style="border-top-width: 0px; border-right-width: 0px; 
border-bottom-width: 0px; border-left-width: 0px; border-style: initial; border-color: initial; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; padding-top: 3px; padding-right: 3px; padding-bottom: 3px; padding-left: 3px; width: auto; font: normal normal normal 100%/normal Georgia, serif; text-align: left; "&gt;This article presents my thoughts on DOs and DONTs of starting a new software project. Many of the lessons listed here are already well known or common sense, but they are not always followed!&lt;/div&gt;&lt;div style="border-top-width: 0px; border-right-width: 0px; border-bottom-width: 0px; border-left-width: 0px; border-style: initial; border-color: initial; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; padding-top: 3px; padding-right: 3px; padding-bottom: 3px; padding-left: 3px; width: auto; font: normal normal normal 100%/normal Georgia, serif; text-align: left; "&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="border-top-width: 0px; border-right-width: 0px; border-bottom-width: 0px; border-left-width: 0px; border-style: initial; border-color: initial; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; padding-top: 3px; padding-right: 3px; padding-bottom: 3px; padding-left: 3px; width: auto; font: normal normal normal 100%/normal Georgia, serif; text-align: left; "&gt;DOs&lt;br /&gt;&lt;div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;&lt;b&gt;Brainstorm often: &lt;/b&gt;During the initial phases of software growth, or even before you write a single line of code, you should hold several brainstorming sessions. They could be on validating your idea, figuring out competition, predicting the future, picking a programming language, potential learning, etc. This is the difference between a carefully planned birth and an unexpected pregnancy. Just because you &lt;i&gt;can&lt;/i&gt; write some software, should you? 
Especially if better alternatives exist?&lt;/li&gt;&lt;li&gt;&lt;b&gt;Use a good version control system&lt;/b&gt;: Even for the most trivial projects, you should try to use a version control system. I like SVN (Subversion) for my open-source projects, but if you can afford git, it works better for complex project management. If you are starting an open source project, consider code.google.com for hosting your SVN repository -- it is fast, simple and hassle free. It is like a good home for your baby software.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Document all ideas:&lt;/b&gt; When the software is evolving you will have many ideas for new features, doing things differently, or incorporating competing features. Obviously, due to lack of resources and time, you won't be able to incorporate all of these. But you must document all the ideas, and if possible prioritize them. Keep a single list of ideas. Usually the software will evolve on its own to attract new features. Implement only the most crucial ideas and features, and resist the temptation to add many features.&lt;/li&gt;&lt;li&gt;&lt;b&gt;Few developers during growth:&lt;/b&gt; Keep the core set of excellent developers to one, two or at most three when the project is growing. Every major piece of software should have only one excellent developer. This avoids unnecessary friction and induces a feeling of ownership. Software is like a baby, which needs a good parent to raise and grow, before it can mature and face the world. You wouldn't want to raise your software in a foster house where nobody feels ownership, i.e., in an organization with an engineering "team".&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;Pick the right language and tools: &lt;/span&gt;Every programming language has some strengths and weaknesses. Make sure you select a language that is quick to develop with and maintain, and that works well for your target application. 
For example, with low-level C/C++ you get performance, and with high-level Java or Python you get portability. Over the years I have liked Python for most of my applications. Unfortunately, in corporate environments Java is the favorite child because many times more software developers and managers know it well. For modern Internet and web applications, Python, Ruby, Erlang and ActionScript are becoming more popular.&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-weight: bold;"&gt;Include testing and defensive programming:&lt;/span&gt; To be successful, sooner or later your software project will need to get out of the demo-mode and face the real world. By then it may be too late to worry about scalability or glaring bugs if fixing them involves redesigning your software. It saves a lot of time and energy to use common techniques such as good logging, unit testing, performance best practices, and defensive programming from day 1. Also maintain an issue tracker and log even the tiniest of issues with your software. Sooner or later you will need to address them.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;DONTs&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Don't procrastinate: If you have an idea to work on, don't procrastinate. Just get started, write something up, try to get a prototype going. Most successful projects need a complete re-write at least once. So don't be afraid to write throwaway code. &lt;/li&gt;&lt;li&gt;Don't document before coding: While software engineering people will say that you should follow a good software process -- writing requirements specification, design document, test cases, etc. -- those can be written later too! Source code is what makes or breaks software. You can write detailed specification and design documents after you already have a prototype and want to document it or propose a change. 
In my experience, any design document written before writing the code is incorrect, and needs to change drastically after the source code is written.&lt;/li&gt;&lt;li&gt;Don't spend time on one-off items: For your software, there are some items which are directly related, and then there are one-off items. For example, for a VoIP client, the protocol implementation, good voice quality, etc., are directly related. On the other hand, having a user signup page, instant messaging text chat, file sharing, etc., are one or two-off items, which are not directly related, but indirectly assist users in VoIP. When you start a project, do not spend time doing one-off items, but work on directly related items first.&lt;/li&gt;&lt;li&gt;Don't wait too long for 1.0 release: There is an 80% difference between 80%-complete software and released software. When you formally release your software, you have to take care of the user manual, getting started guide and installer, as well as fix those last annoying bugs. In the case of software projects, it is very easy to get started but very difficult to finish. There is always an endless list of features which need to be completed before the release, and hence your release never happens. Unless you make it happen. 
You will have to make a firm decision about what bugs are important and what can remain as known issues for version 1.0.&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-7306518558229174618?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/7306518558229174618/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=7306518558229174618' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7306518558229174618'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7306518558229174618'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2010/09/lessons-in-starting-software-project.html' title='Lessons in starting a software project'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-365013672620500341</id><published>2010-08-08T20:55:00.000-04:00</published><updated>2010-08-09T01:07:21.430-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='video'/><category scheme='http://www.blogger.com/atom/ns#' term='Conferencing'/><category scheme='http://www.blogger.com/atom/ns#' term='Flash Player'/><category scheme='http://www.blogger.com/atom/ns#' term='API'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><title 
type='text'>Flash-VideoIO: Flash-based audio and video communication</title><content type='html'>&lt;p&gt;I launched the &lt;a href="http://code.google.com/p/flash-videoio"&gt;Flash-VideoIO&lt;/a&gt; project to facilitate audio and video communication using easy-to-use reusable Flash application with extensive JavaScript API. More from the project page...&lt;/p&gt;&lt;p&gt;"Flash-VideoIO is a reusable generic Flash application to record and play live audio and video content. The Flash-VideoIO project aims at implementing a generic Flash application named VideoIO.swf which can be used for variety of use cases in audio and video communication, e.g., live camera view, recording of multimedia messages, playing video files from web server or via streaming, live video call and conferencing using client-server as well as peer-to-peer technology."&lt;/p&gt;&lt;p&gt;Developers are invited to explore and experiment with the VideoIO component, provide feedback and/or contribute to the development.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-365013672620500341?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/365013672620500341/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=365013672620500341' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/365013672620500341'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/365013672620500341'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2010/08/flash-videoio-flash-based-audio-and.html' title='Flash-VideoIO: Flash-based audio and video communication'/><author><name>Kundan 
Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-7806683339093952258</id><published>2010-03-31T22:39:00.010-04:00</published><updated>2010-04-26T14:02:00.311-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='server'/><category scheme='http://www.blogger.com/atom/ns#' term='Programming'/><category scheme='http://www.blogger.com/atom/ns#' term='reliability'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='scalability'/><title type='text'>Distributed Systems Development: Client vs Server</title><content type='html'>In this article I compare distributed systems development for clients and servers. When you start implementing a distributed system such as a client or server for some protocol, the basic functionality is easy to implement. But to make your software usable in the real world, the client-specific or server-specific considerations take a lot of time. 
This article tells you &lt;i&gt;how to build good quality distributed systems: client or server&lt;/i&gt;.&lt;br /&gt;&lt;h4&gt;Client&lt;/h4&gt;&lt;i&gt;Considerations&lt;/i&gt;: Auto-configuration, IP address change detection, NAT and firewall traversal, robustness against failures, adaptation to network conditions, consistent user interface and view, command line vs user interface, guaranteed security, idle and sleep detection, responsiveness of user interface, redundant connections to servers, keep-alive, caching, analytics.&lt;br /&gt;&lt;br /&gt;&lt;div&gt;&lt;i&gt;Examples&lt;/i&gt;: Firefox browser, Skype, Gtalk&lt;/div&gt;&lt;div&gt;&lt;i&gt;&lt;br /&gt;&lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;Description&lt;/i&gt;: A client should automatically configure as much as possible, e.g., network IP, hostname, username, machine type, etc., from the system. If the client relies on the local IP address, it should automatically detect any change in that address. For example, a SIP client should re-register the new IP address as contact if there is a change. NAT and firewall traversal is one annoying reality on the Internet. Most often an HTTP based client works out of the box because most networks are permissive of HTTP and HTTPS. However, if you are building any other client such as IM and chat, VoIP or media recording, then there is some network in some enterprise which will block your connection. Most protocols have an alternative to perform NAT and firewall traversal. For example, RTMP has RTMPT, XMPP can work on BOSH, and SIP uses a bunch of techniques. &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A client should be able to adapt to any network condition. This not only applies to the network topology and filtering, but also to the bandwidth and quality. A VoIP client should automatically adapt to a lower-quality codec if it detects lower end-to-end bandwidth. Bandwidth detection and adaptation should be a continuous process instead of being performed only at the beginning. 
If you need to connect to a server, and there are many distributed servers, the client could periodically detect a list of closest servers, and connect to one or more of the closest servers in network proximity, where network proximity is determined by network distance or by delay and jitter. This allows your client to handle geographic distribution. If you have multiple redundant servers, your client should be able to fail over in case one server fails. A better approach could be to keep persistent connections to two servers, so that failover latency is minimized. The automatic configuration, detection and adaptation to various network and system conditions are among the most crucial properties of successful peer-to-peer clients such as Skype. Some clients need detection of idle or sleep behavior, e.g., to update your presence status. If the user puts the system on sleep (or standby) then your software may not get any chance to communicate to the server about the status. In such cases, your protocol or server should be robust in detecting idle clients.&lt;br /&gt;&lt;br /&gt;A client is user-facing software. The responsiveness of the client user interface distinguishes good software from average software. For example, if the client doesn't get a server response within 200 ms, it can automatically inform the user via an hour glass or rolling wheel indication. If your GUI becomes unresponsive while it is "processing" instead of giving an indication, then the user is likely to get annoyed or to make mistakes, such as clicking on the same button multiple times. You should always use an event-based system for your user interface, instead of synchronous processing, especially if it can block. Caching can be used if needed to speed up your performance. For example, instead of fetching the user list to display in your client every time you switch to the user list view, you can cache it and display a cached copy. Periodically refresh your local cache with the actual data from the server. 
Caching is also useful in other places where client-server communication becomes overloaded. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Command line clients are becoming less common these days. But such clients are more powerful in some scenarios. Consider whether a command line alternative is useful and feasible as well. Finally, a guarantee on security is a must for Internet client applications. Most application protocols define secure communication, e.g., over TLS/SSL, S/MIME, etc. Your client should have an option to go completely secure and encrypted.&lt;br /&gt;&lt;br /&gt;In summary, good client software is software that does the one thing it is meant for. You may add many new features, but how you do the essential function is what will make your client useful and popular. Consider using analytics to find which features are gaining popularity and which are no longer used. Software is like a human body. If you don't exercise to remove body fat -- that is, remove unused pieces and refactor periodically -- it will become too fat, slow and useless. 
This is more important for client-facing software, because client behavior keeps changing and the client you used last year may not be the same client this year.&lt;i&gt;&lt;br /&gt;&lt;/i&gt;&lt;h4&gt;Server&lt;/h4&gt;&lt;i&gt;Considerations&lt;/i&gt;: Easy configuration, logging, vertical and horizontal scalability, robustness and automatic failover, auto loading of configuration changes, connectivity to different backends, programmability, event based but multi-threaded, use multi-core CPUs, memory usage optimization, management console, command line control, activity monitoring, admission control for quality, stateless vs stateful, replication of critical data, partitioning of data for scalability, caching, keep-alive for crash detection of server, detection of idle or unresponsive clients.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Examples&lt;/i&gt;: Apache web server, ejabberd, SIP express router&lt;/div&gt;&lt;div&gt;Anti-example: Tomcat, Flash Media Server&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;Description&lt;/i&gt;: A server should have explicit, easy and extensive configuration options so that it can be deployed in a variety of scenarios, e.g., the Apache config file. Note that when it comes to configuration: explicit is better than implicit, easy is better than complex. Another important feature of the server is being able to load the configuration changes without having to kill the server. For example, Tomcat automatically detects new war files and re-deploys the applications. Apache web server can be made to re-read the configuration using a Unix signal. Some servers take the configuration to an extreme by defining an easy to use script that controls the server behavior. For example, SIP express router defines a Perl-like scripting language to handle incoming requests, forward them to a telephony gateway or perform authentication. 
Such fine-grained configuration allows deploying the server in a variety of environments -- from personal use to enterprise or carrier deployments. On the other hand, I find the J2EE model of defining services and classes in XML configuration files hard to use. Even though the configuration is done by configuration file or script, an easy-to-use web-based management console gives a clean interface for server control and monitoring. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Easy to use and configurable logging is another crucial piece of server software. A server log is typically the first place you go when you detect a problem. There is a tradeoff between extensive logging vs selective logging. I prefer extensive logging with selective viewing. Also I prefer accessing the log from the command line using "tail -f logfile.log" instead of the variety of web based log viewers.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Scalability and robustness are part of good server design. There are many other articles and web sites dedicated to this discussion, e.g., &lt;a href="http://highscalability.com/"&gt;highscalability.com&lt;/a&gt;. There are several techniques such as event based thread pool, connectivity to different backends, bi-directional master-slave databases, replication of critical data, in-memory distributed cache such as memcached, partitioning of data, two-stage load sharing architecture, and use of servers from different vendors for robustness against security exploits. The server should prefer stateless operations. It should be able to detect unresponsive clients in case of stateful sessions, e.g., by periodically sending keep-alives. Note that a server-initiated keep-alive is more robust than a client-initiated one for distributed applications. 
For example, in client-initiated keep-alive, if client1's keep-alive fails, client1 assumes it is disconnected, but client2 doesn't know that client1 is disconnected; whereas in server-initiated keep-alive, once the server detects that client1 is disconnected, it can inform other related clients about it.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The server should use the available resources in the best possible way. Typically memory, CPU and bandwidth are the critical resources. Some form of activity monitor should detect the resource usage by the server and inform the concerned IT person in case of abnormal behavior. Such behavior could be caused by a memory leak in the server or by a security attack from malicious systems. Obviously the implementers should strive to fix any memory leaks. Another useful behavior by the server is to do admission control based on available resources. For example, if the server detects that it is using 90% of its bandwidth, then it should not admit a new media streaming client, or if it detects that its CPU is fully utilized, it should reject new requests with an appropriate error response, so that the client retries with exponential back-off timeouts. In a distributed server farm, the servers should be able to not only automatically configure based on configuration of other servers, but also detect overload on and share load from other servers in the farm. For example, a self organizing server can detect other servers in the farm, and automatically assume load sharing and/or secondary server responsibility.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In summary, configuration, scalability and robustness form the core of a good server implementation. 
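&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As an illustration of the admission control and client back-off described above, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the function names admit and backoff_delays, the reason strings, and all numbers except the 90% bandwidth figure. How the usage fractions are measured is left out.&lt;/div&gt;&lt;pre&gt;
```python
import random

# Hypothetical thresholds; only the 90% figure comes from the text above.
BANDWIDTH_LIMIT = 0.90
CPU_LIMIT = 1.00


def admit(kind, bandwidth_used, cpu_used):
    """Server side: return (admitted, reason) for a new request.

    kind is "media" for a streaming client, anything else for a plain request.
    bandwidth_used and cpu_used are fractions between 0.0 and 1.0.
    """
    if cpu_used >= CPU_LIMIT:
        return False, "503 cpu overloaded"
    if kind == "media" and bandwidth_used >= BANDWIDTH_LIMIT:
        return False, "503 bandwidth exhausted"
    return True, "200 ok"


def backoff_delays(base=0.5, factor=2.0, cap=30.0, attempts=6, jitter=0.0):
    """Client side: retry delays in seconds, growing by factor up to cap.

    jitter is an optional fraction of random spread applied to each delay.
    """
    delays = []
    delay = base
    for _ in range(attempts):
        capped = min(cap, delay)
        spread = capped * jitter
        delays.append(capped + random.uniform(-spread, spread))
        delay *= factor
    return delays
```
&lt;/pre&gt;&lt;div&gt;With the defaults, a rejected client would wait 0.5, 1, 2, 4, ... seconds between retries, never more than 30. A little jitter keeps many rejected clients from retrying at the same instant. 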
&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-7806683339093952258?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/7806683339093952258/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=7806683339093952258' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7806683339093952258'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7806683339093952258'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2010/03/distributed-systems-development-client.html' title='Distributed Systems Development: Client vs Server'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-2913191424614156907</id><published>2010-03-22T15:18:00.006-04:00</published><updated>2010-03-22T16:20:04.835-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Programming'/><title type='text'>What Great Programmers think?</title><content type='html'>I found a very interesting &lt;a href="http://www.stifflog.com/2006/10/16/stiff-asks-great-programmers-answer/"&gt;blog article&lt;/a&gt; and wanted to summarize the great programmers' view!&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;1. What is the most important skill every programmer should possess?&lt;/div&gt;&lt;div&gt;Good "taste". 
Communication skills and expression in writing. A strong sense of the value of what you are doing. Concentration. Passion. Self motivation. Think clearly. Prefer evidence over intuition. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;2. What will be the next big thing in computer programming?&lt;/div&gt;&lt;div&gt;Web application programming will replace any GTK, Java Swing, Qt, Win32, MFC, etc. Real AI can change the current incremental trends in programming. Large-scale distributed processing. But many great programmers admit that they can't and don't want to predict the future.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;3. Why are some programmers 10 or 100 times more productive than others?&lt;/div&gt;&lt;div&gt;Ability to restate hard problems as easy ones. Genius is one percent inspiration and ninety-nine percent perspiration. Ability to fit the whole problem in their heads at once. Care about what they do. They don't rush and slap things together, but have a holistic picture of what is to be built. Knowledge of tools. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;4. What are the most important tools?&lt;/div&gt;&lt;div&gt;Python, Lisp, Emacs, SVN, MySQL, GIMP, Firefox, TextMate, Pine, Ruby, make, TeX, vi, Unix, sam, bash -- they all are extensible. Learn everything in /bin and /usr/bin on Unix. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;My take on the article is that there is something common among all great programmers -- modesty, persistence, self motivation, taste, and extensive knowledge of useful tools.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I also suggest reading this &lt;a href="http://www.inter-sections.net/2007/11/13/how-to-recognise-a-good-programmer"&gt;great article&lt;/a&gt; on how to recognize great programmers. 
If you want to be successful as a programmer I also suggest reading &lt;a href="http://samizdat.mines.edu/howto/HowToBeAProgrammer.html"&gt;this book&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"   style="font-family:Verdana, Arial, Helvetica, sans-serif;font-size:100%;"&gt;&lt;span class="Apple-style-span"  style="font-size:13px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"   style="font-family:Verdana, Arial, Helvetica, sans-serif;font-size:100%;"&gt;&lt;span class="Apple-style-span"  style="font-size:13px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"   style="font-family:Verdana, Arial, Helvetica, sans-serif;font-size:100%;"&gt;&lt;span class="Apple-style-span"  style="font-size:13px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-2913191424614156907?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/2913191424614156907/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=2913191424614156907' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2913191424614156907'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2913191424614156907'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2010/03/what-great-programmers-think.html' title='What Great Programmers think?'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' 
src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-8604527026847171071</id><published>2010-02-20T13:52:00.007-05:00</published><updated>2010-02-20T14:30:07.560-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='RTMP'/><category scheme='http://www.blogger.com/atom/ns#' term='ActionScript'/><category scheme='http://www.blogger.com/atom/ns#' term='Flash Player'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='Problems'/><category scheme='http://www.blogger.com/atom/ns#' term='RTMFP'/><title type='text'>FAQ on using Flash Player to make phone calls</title><content type='html'>&lt;div&gt;I present my answers to some frequently asked questions (FAQ) on using Flash Player to make phone calls.&lt;/div&gt;&lt;div class="im"&gt;&lt;br /&gt;&lt;b&gt;1. Is Flash Application a good choice for VOIP?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;It depends. An RTMP-based application is not a good choice, whereas a newer RTMFP-based application is good for Flash to Flash Internet voice applications. For Flash to Phone applications, Flash is not a good choice as it is. Flash is good at user interface and ubiquitous availability, but the TCP-based RTMP is not suitable for real-time interactive media, and the UDP-based RTMFP is proprietary so it cannot interwork with existing SIP-based VoIP systems. 
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Secondly, Flash Player is missing some of the crucial VoIP pieces such as good silence suppression and echo cancellation, so a Flash-based VoIP client becomes useless without a headset.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Thirdly, although Flash Player supports the open-standard Speex audio codec, many existing VoIP providers do not support Speex, and expect only traditional voice codecs like G.729 and G.723.1. So you may also need to incorporate transcoding, which is CPU intensive. Video transcoding is more difficult because of the proprietary video codec in Flash Player.&lt;br /&gt;&lt;div class="im"&gt;&lt;br /&gt;&lt;b&gt;2. Will there be any performance degradation when the call goes through the following paths? &lt;span class="Apple-style-span" style="font-weight: normal; "&gt;(Flash Client -&gt; Media Server -&gt;RTMP to SIP Converter -&gt; VOIP Server -&gt; VoIP/PSTN Gateway -&gt; PSTN Network -&gt; Telephone)&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div class="im"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="im"&gt;Yes. If you can avoid intermediaries to cut down on media path latency, it will help a lot. Typically the VoIP Server (or SIP proxy server) is independent of the media path, so it doesn't affect the media quality. But the media path goes through the Media Server (FMS?) and the RTMP to SIP converter, and over TCP at that. This degrades the quality a lot. One way could be to remove the "Media Server" from your path by having the Flash Client directly connect to the RTMP to SIP converter. Also if you can reduce the network distance between the Flash Client and RTMP to SIP Converter, that will help a lot.&lt;/div&gt;&lt;div class="im"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="im"&gt;Secondly, with Flash Player you may need to do audio transcoding in your RTMP to SIP converter. This further degrades the performance and limits the scalability of your converter.&lt;/div&gt; &lt;div class="im"&gt;&lt;br /&gt;&lt;b&gt;3. 
Some experts say that development in C or C++ is preferred over Flash Player for VoIP calls to phones, for performance reasons. Is that true?&lt;/b&gt;&lt;/div&gt;&lt;div class="im"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="im"&gt;A native VoIP client is preferred over Flash Player because the media packets can go directly from the client to the telephone instead of going through the RTMP to SIP converter. The advantage arises because (1) the native client can use UDP instead of being restricted to TCP-based RTMP, and (2) the network distance is lower for a direct path. Even if your converter is on a good network and close to your client so that the network distance is not much of an issue, the UDP-vs-TCP difference has a great impact on the quality of a native VoIP client implementation compared with Flash Player.&lt;/div&gt;&lt;div class="im"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="im"&gt;In general the network component affects the quality more than the programming language. So whether you use C/C++, Python, Java or some other language, it doesn't matter much. But if you can have an &lt;b&gt;end-to-end&lt;/b&gt; media path over &lt;b&gt;UDP&lt;/b&gt; between the two clients, or between the client and the gateway, it is much better. Obviously with Flash Player you cannot have the packets go directly unless your RTMP to SIP converter is local to the Flash Client.&lt;/div&gt;&lt;br /&gt;All the existing good quality systems (Skype, GTalk) tend to use an end-to-end media path over UDP as much as possible.&lt;br /&gt;&lt;div class="im"&gt;&lt;br /&gt;&lt;b&gt;4. There are different media servers available, like Adobe Flash Media Server (FMS), Wowza, Red5, etc. Which one is the best choice?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;Do you still want to pursue an RTMP to SIP converter? Anyway, in terms of performance I would &lt;i&gt;guess&lt;/i&gt; that FMS is the best choice. But if your aim is to build an RTMP to SIP converter then Red5 is probably the best. 
FMS is proprietary with few customization/programming choices available, so you cannot easily integrate a SIP stack or an RTMP to SIP converter with FMS. On the other hand Red5 is completely open source and in Java, so it allows easy integration with other Java-based SIP stacks. Additionally you could integrate SIP stacks written in other advanced languages such as Python or Ruby because Red5 allows applications in those languages, whereas an FMS application is restricted to ActionScript 1.0.&lt;br /&gt;&lt;div class="im"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="im"&gt;I haven't worked with or used Wowza so I cannot comment on that. I have worked with FMS and Red5 though, as well as the Python-based &lt;a href="http://code.google.com/p/rtmplite"&gt;rtmplite&lt;/a&gt; and &lt;a href="http://code.google.com/p/siprtmp"&gt;siprtmp&lt;/a&gt; projects. &lt;/div&gt; &lt;div class="im"&gt;&lt;br /&gt;&lt;b&gt;6. We are now confused about whether to develop our VoIP application in Flash technology or QT/Java/C#. What will be your choice?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;I think that decision mostly comes from your business case. But I would suggest non-Flash technology if possible and if your business demands very good quality of voice service. If your VoIP client will be assisting your main business, then people won't mind downloading and installing the VoIP client. The advantage Flash has is that it is already available in most people's browsers, so it doesn't require an additional download or installation. So if your VoIP application is only a small part of your main web-based business, then Flash technology will be better I think.&lt;br /&gt;&lt;br /&gt;Another option is to use the Gmail video/voice architecture described in my &lt;a href="http://p2p-sip.blogspot.com/2009/06/how-does-google-video-chat-work-in.html"&gt;article&lt;/a&gt;. 
Basically it uses Flash Player for user interface, but all the networking or voice related processing happens using their native GoogleTalk plugin.&lt;div&gt; &lt;div class="im"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-8604527026847171071?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/8604527026847171071/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=8604527026847171071' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/8604527026847171071'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/8604527026847171071'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2010/02/faq-on-using-flash-player-to-make-phone.html' title='FAQ on using Flash Player to make phone calls'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-4077659397124928882</id><published>2010-01-07T19:34:00.007-05:00</published><updated>2010-04-14T21:38:55.581-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='P2P'/><category scheme='http://www.blogger.com/atom/ns#' term='39 Peers'/><category scheme='http://www.blogger.com/atom/ns#' term='idea'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category 
scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><title type='text'>Project ideas in P2P-SIP</title><content type='html'>This article lists some project ideas related to P2P-SIP. Feel free to contact me if you are a student interested in the project. For general networking and multimedia related project topics please visit &lt;a href="http://myprojectguide.org/"&gt;http://myprojectguide.org&lt;/a&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Survey of techniques for P2P VoIP: The project is a study project that surveys several existing P2P VoIP systems such as Skype, Gtalk, and several proposals in the IETF P2P-SIP working group. The questions to answer are what are the best practices, what works, what doesn't, what has been implemented and tested, how do they compare with respect to scalability, robustness and security. Instead of comparing the whole systems, it is better to compare one function at a time, e.g., network maintenance, lookup, signup, etc. Clearly a critical and objective analysis is more useful than saying A is better than B.&lt;/li&gt;&lt;li&gt;Implement Reload: Implement the current version of the IETF P2P-SIP WG internet-draft. You may reuse the core p2p, dht and rfc3261 modules of the 39 peers project. The goal is to implement the core functions that allow incorporating an external DHT algorithm, authentication as well as transport. I have a few suggestions to simplify the draft and hence the implementation. Language: Python.&lt;/li&gt;&lt;li&gt;P2P Simulator: Build a peer-to-peer network simulator in Python that implements various structured and unstructured algorithms. Related work includes p2psim, oversim and peersim. This project will build several modules to allow a developer to quickly put together an implementation of a P2P algorithm, view it graphically, and see the impact of various parameters. You should be able to support nodes behind NAT and firewall in your simulator, incorporate security and analyze the performance. 
Language: Python&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-4077659397124928882?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/4077659397124928882/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=4077659397124928882' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/4077659397124928882'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/4077659397124928882'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2010/01/project-ideas-in-p2p-sip.html' title='Project ideas in P2P-SIP'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-1191233159993821848</id><published>2009-12-21T17:00:00.009-05:00</published><updated>2009-12-24T17:59:29.599-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Comparison'/><category scheme='http://www.blogger.com/atom/ns#' term='API'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><title type='text'>Software API Design</title><content type='html'>In this article I present my view on the design of software APIs. Several other people have written extensively about the importance of API design and best practices, mostly in the Java community, but still I find so many poorly designed software APIs. 
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;What is a Good API?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;At the high level only a few design principles: &lt;i&gt;complete&lt;/i&gt;, &lt;i&gt;easy-to-use&lt;/i&gt; and &lt;i&gt;minimum&lt;/i&gt;, are enough to design a good API. The API should be complete, without any missing element, to achieve all the necessary functions of the software. At the same time, the API should be minimum, without much redundancy. For example, if the same thing can be done using two methods, pick the best, and use that. Most importantly the API should be easy to use. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Consider the Unix socket API. It provides the core functions such as connect, sendto, bind, using explicit methods whereas additional rarely used functions are supported using setsockopt or ioctl. The best feature of the API is that it treats a socket like a file descriptor, so you could indeed use the file related functions on a socket.&lt;br /&gt;&lt;pre&gt;s = socket(...);&lt;br /&gt;write(s, ...);    /* use it like a file */&lt;br /&gt;&lt;/pre&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Asynchronous I/O&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The annoying part of the Unix socket API is the select and poll set of functions for doing asynchronous I/O. The primary reason is that there is no standardized asynchronous API in Unix. On the other hand Windows invented the WSAAsyncSelect function which is more difficult to use. An event-based API for asynchronous I/O is clean and easy to use, provided the event is dispatched by the relevant object.&lt;br /&gt;&lt;pre&gt;s = new Socket();&lt;br /&gt;s.addEventListener("socketData", handlerFunction);&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;b&gt;Return vs exception&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;An exception is better for indicating failure cases, if the failure is less likely to happen. 
On the other hand, a return value indicating failure is suitable if the failure is part of the result. For example, a findUser() function can return the User object on success, and null on failure such as not-found. What if there is some other kind of failure? Should it return null or throw an exception?&lt;br /&gt;&lt;br /&gt;Exceptions make for cleaner code, at least in Python. They avoid several if-else constructs to check the return values, and you can accumulate all error handling in one place, either in this function or further up the call stack.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Value vs reference&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;You would have studied pass-by-value and pass-by-reference in your programming class. This topic goes beyond that. For example, if a Camera object represents a default-camera for your machine, should the Camera object automatically switch to the new camera if the user changes the default camera from the control panel independent of your software?&lt;br /&gt;&lt;pre&gt;cam = Camera.getCamera("default")&lt;br /&gt;# will cam be the current snapshot of default-camera or will it automatically&lt;br /&gt;# change when default-camera changes?&lt;br /&gt;&lt;/pre&gt;In your software, if you need to represent the local logged in user as a LocalUser object, should you create a new object when the login changes or should the same object update its state to reflect the new logged in user? If you use a local structure to represent the local listening IP address of your network application, should this structure automatically change its IP value whenever your machine's IP address changes?&lt;br /&gt;&lt;br /&gt;The value object is easy to implement and understand. However, if an application needs to detect any change in the value, it needs periodic polling or event dispatching on change. The reference object is more convenient and clean to the application developer, but needs additional work in the implementation of the API. 
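&lt;br /&gt;&lt;br /&gt;The difference can be sketched in Python as follows. This is only an illustration: ControlPanel, CameraValue and CameraReference are hypothetical names invented for this sketch, not part of any real camera API.&lt;br /&gt;
&lt;pre&gt;
```python
class ControlPanel:
    """Hypothetical settings store; stands in for the system control panel."""

    def __init__(self, default_camera):
        self.default_camera = default_camera


class CameraValue:
    """Value object: a snapshot of the default camera taken at creation time."""

    def __init__(self, panel):
        self.name = panel.default_camera


class CameraReference:
    """Reference object: looks up the default camera every time it is read."""

    def __init__(self, panel):
        self._panel = panel

    @property
    def name(self):
        return self._panel.default_camera


panel = ControlPanel("Built-in camera")
snapshot = CameraValue(panel)
live = CameraReference(panel)

panel.default_camera = "USB camera"   # the user switches the default camera
print(snapshot.name)                  # the value object still holds the old name
print(live.name)                      # the reference object follows the change
```
&lt;/pre&gt;
The value object keeps whatever it copied when it was created, whereas the reference object pays for its convenience with a lookup on every read.&lt;br /&gt;&lt;br /&gt;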
One option is to have a generic reference wrapper, which can be used to convert any value object to a reference object, as long as the change detection is uniform.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Synchronous vs non-blocking&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;Synchronous APIs are easy to understand and use whereas non-blocking event-based APIs are difficult to use. Non-blocking calls have the advantage that your thread is not held up waiting for the response. A held-up thread is particularly a problem for single-threaded software systems. In some platforms, e.g., ActionScript on (single threaded) Flash Player, there is no choice, and you must do non-blocking calls. Python provides constructs such as yield, which allow you to write synchronous-looking co-operative multitasking software. Thus, even though your application code looks synchronous, it actually yields to other tasks behind the scenes. Clearly critical sections and shared resources must be protected appropriately for read and write access as needed.&lt;br /&gt;&lt;pre&gt;task1: data = yield multitask.recv(sock, timeout=10)&lt;br /&gt;task2: yield multitask.send(sock, somedata)&lt;/pre&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Generic vs specific&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;While most API designers tend to write methods that are specific to one task, the arguments should be generic. In C++ or Java, templates or generics allow you to specify a generic algorithm. The question of generic vs specific spans more use cases: should you supply a String as your method name, or should you define an Enum and use that? 
If you already have a function getUser(String name), should you define another function getUserById(int id) or should you overload using argument type?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A generic API is more extensible but care must be taken to handle various edge cases and throw appropriate exceptions.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Documentation and testing&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;How should the API be specified? Should there be a set of documents that describe each and every function in various use cases, or should there be built-in test cases that show how to use the API? Or both? With documents alone, how do you make sure that they are updated every time someone updates the implementation? Python's doctest module allows you to integrate the unit testing in the API document itself. This makes sure that your unit test will fail if the implementation no longer matches the documented examples.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Analogy&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The single feature that most improves the usability of an API is an analogy with something that already exists. For example, a web interface modeled after Unix file I/O is very easy to understand and use. On the other hand, a completely new paradigm for your API will make it hard to understand. Other examples of existing paradigms are the listener-provider, event dispatcher, property getter setter, read-only vs read-write property, attribute vs container access, dictionary or hash map collection. 
For example, if you want to implement a new P2P storage module, consider exposing it as a dictionary where a user can put and get values using keys, and can replace existing code that uses a local dict or HashMap with your new distributed dictionary.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Conclusion&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Software APIs tend to last longer than expected. Careful thought in the design process can help your API withstand the test of time. When you design an API, weigh your options with respect to these properties: synchronous vs asynchronous, return vs exception, value vs reference, generic vs specific, and incorporate an analogy with an existing paradigm as well as automatic documentation and testing tools.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-1191233159993821848?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/1191233159993821848/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=1191233159993821848' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/1191233159993821848'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/1191233159993821848'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/12/software-api-design.html' title='Software API Design'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' 
src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-5185662163878399361</id><published>2009-12-09T15:03:00.003-05:00</published><updated>2009-12-24T16:51:36.778-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Lessons'/><category scheme='http://www.blogger.com/atom/ns#' term='Systems'/><category scheme='http://www.blogger.com/atom/ns#' term='Problems'/><category scheme='http://www.blogger.com/atom/ns#' term='Inspiration'/><title type='text'>Systems Software Research</title><content type='html'>A very interesting talk by Rob Pike on "&lt;a href="http://herpolhode.com/rob/utah2000.pdf"&gt;Systems Software Research is Irrelevant&lt;/a&gt;".&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Some quotes from the slides (by Rob Pike)&lt;/span&gt;:&lt;br /&gt;&lt;br /&gt;"We see a thriving software industry that largely ignored research, and a research community that writes papers rather than software".&lt;br /&gt;&lt;br /&gt;"Java is to C++ as Windows is to Macintosh: an industrial response to an interesting but technically flawed piece of systems software."&lt;br /&gt;&lt;br /&gt;"Linux's cleverness is not in the software, but in the development model, hardly a triumph of academic CS (software engineering) by any measure."&lt;br /&gt;&lt;br /&gt;"It (systems research) is just a lot of measurement: a misinterpretation and misapplication of the scientific method. Invention has been replaced by observation."&lt;br /&gt;&lt;br /&gt;"If it didn't run on a PC, it didn't matter because the average, mean, median, and mode computer was a PC."&lt;br /&gt;&lt;br /&gt;"To be a viable computer system, one must honor a huge list of large, and often changing, standards: TCP/IP, HTTP, HTML, XML, CORBA, Unicode, POSIX, NFS, SMB, MIME, POP, IMAP, X, ... 
With so much externally imposed structure, there is little left for novelty."&lt;br /&gt;&lt;br /&gt;"Commercial companies that 'own' standards deliberately make standards hard to comply with, to frustrate competition. Academia is a casualty."&lt;br /&gt;&lt;br /&gt;"New employees in our lab now bring their world (Unix, X, Emacs, Tex) with them, or expect it to be there when they arrive... Narrowness of experience leads to narrowness of imagination."&lt;br /&gt;&lt;br /&gt;"In science, we reserve our highest honors for those who prove we were wrong. But in computer science..."&lt;br /&gt;&lt;br /&gt;"How can operating systems research be relevant when the resulting operating systems are all indistinguishable? (Unix is) a victim of its own success: portability led to ubiquity. That meant architecture didn't matter, so there's only one."&lt;br /&gt;&lt;br /&gt;"Government funded and corporate research is directed at very fast 'return on investment'... The metric of merit is wrong."&lt;br /&gt;&lt;br /&gt;"Measure success by ideas, not just papers and money. Make the industry want your work."&lt;br /&gt;&lt;br /&gt;"The future is distributed computation, but the language community has done very little to address that possibility."&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;My take on the lessons learned, again in the form of quotes&lt;/span&gt;:&lt;br /&gt;&lt;br /&gt;"Keep the ideas flowing, even if the implementation is not feasible (using existing systems)."&lt;br /&gt;&lt;br /&gt;"When thinking of distributed systems -- think beyond web, browser and Flash Player."&lt;br /&gt;&lt;br /&gt;"That something is popular does not mean it is the correct or best way to do that thing."&lt;br /&gt;&lt;br /&gt;"Do not publish papers that fake measurement as research."&lt;br /&gt;&lt;br /&gt;"Do not take a job that you are not truly motivated about."&lt;br /&gt;&lt;br /&gt;"Writing software in Java is like writing detailed machine instructions. 
Learn Python instead."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-5185662163878399361?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/5185662163878399361/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=5185662163878399361' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/5185662163878399361'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/5185662163878399361'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/12/systems-software-research.html' title='Systems Software Research'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-7013672250047972132</id><published>2009-12-07T19:09:00.004-05:00</published><updated>2009-12-24T16:50:07.981-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='RESTful'/><category scheme='http://www.blogger.com/atom/ns#' term='ActionScript'/><category scheme='http://www.blogger.com/atom/ns#' term='Flash Player'/><category scheme='http://www.blogger.com/atom/ns#' term='Flex'/><title type='text'>REST, Flash Player, Flex</title><content type='html'>In doing some experiments with Flex and RESTful architecture, I realized that there seem to be a whole lot of problems. 
Flash Player was designed to support traditional HTTP web access such as form posting or resource retrieval using GET and POST. So a number of features that are used in RESTful design are not supported by Flash Player. People have written additional client-side or server-side kludges to work around the problems with HTTP support in Flash Player. Most of the server-side changes are hacks, and the client-side changes are in external third-party libraries, which are sometimes missing crucial features like cookies, TLS, etc. Even in the best possible scenario, you still need to provide a crossdomain policy file, even if the Flash application is accessing resources on the same server. One of the reasons, I suspect, is that Flash Player relies on the browser for HTTP support, and hence supports only the lowest common set of features, which are not enough for REST architecture.&lt;br /&gt;&lt;br /&gt;What is the solution? Depends on what you want to do.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;If you have control over the server side of the system, you need to incorporate certain kludges to support Flash restrictions. For example, provide a crossdomain policy-file, map some header or URL to the methods PUT and DELETE, or return a 200 success response with the message body or headers containing the actual error code (e.g., 404 vs 405 vs 501). Any of these techniques looks like a hack at best.&lt;/li&gt;&lt;li&gt;If you have control over the client, you can use an existing third-party RESTful client library in ActionScript. However, be prepared to provide a crossdomain policy-file or incorporate a proxy. Alternatively, you can also perform some Flash Player-specific translations on the fly in your proxy.&lt;/li&gt;&lt;li&gt;Use Flash Player's ExternalInterface mechanism and incorporate your RESTful client code in JavaScript. This is sometimes not easy, error-free or feasible. Moreover, now your Flash application depends on your JavaScript.&lt;/li&gt;&lt;/ol&gt;What are the problems with these solutions? 
&lt;div&gt;&lt;ol&gt;&lt;li&gt;You cannot build a general purpose RESTful web application (server-side) and expect a Flash Player application to use it. You will need to be Flash Player-aware. &lt;/li&gt;&lt;li&gt;You cannot build a general purpose Flash application (client-side) to use an existing RESTful web service without additional dependencies and network elements (proxies).&lt;/li&gt;&lt;/ol&gt;What is the real solution?&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;It looks like HTTP+REST won't really be able to solve all the problems in ActionScript for Flash Player without Adobe's blessings, e.g., incorporating a full HTTP stack in the Flash Player or providing a way other than crossdomain to authenticate scripts.&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;Some additional references [&lt;a href="http://www.fngtps.com/2007/06/flex-can-t-do-rest"&gt;1&lt;/a&gt;, &lt;a href="http://code.google.com/p/resthttpservice"&gt;2&lt;/a&gt;, &lt;a href="http://stackoverflow.com/questions/153420/is-it-feasible-to-create-a-rest-client-with-flex"&gt;3&lt;/a&gt;, &lt;a href="http://cookbooks.adobe.com/post_Flex_REST_client-14106.html"&gt;4&lt;/a&gt;]&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-7013672250047972132?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/7013672250047972132/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=7013672250047972132' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7013672250047972132'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7013672250047972132'/><link rel='alternate' type='text/html' 
href='http://p2p-sip.blogspot.com/2009/12/rest-flash-player-flex.html' title='REST, Flash Player, Flex'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-7618697565282252451</id><published>2009-11-30T18:18:00.010-05:00</published><updated>2009-12-07T13:05:22.279-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='RTMP'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P'/><category scheme='http://www.blogger.com/atom/ns#' term='Conferencing'/><category scheme='http://www.blogger.com/atom/ns#' term='RESTful'/><category scheme='http://www.blogger.com/atom/ns#' term='videocity'/><category scheme='http://www.blogger.com/atom/ns#' term='restlite'/><category scheme='http://www.blogger.com/atom/ns#' term='REST'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><title type='text'>REST and SIP</title><content type='html'>This article describes a RESTful SIP application server architecture.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Why do we need this?&lt;/b&gt;&lt;br /&gt;SIP is the protocol of choice for Internet session initiation and control such as for VoIP or multimedia calls. Although SIP is similar to HTTP in many respects, there are crucial differences in the design. Two of the major difficulties web developers face in adopting SIP are (1) there are no SIP-based web tools comparable to the programming libraries for HTTP and XMPP on Flash Player, and (2) the initial cost of getting started with a basic working system is huge, with a lot of specifications, e.g., for NAT and firewall traversal. 
On the other hand, web developers are used to building applications on top of HTTP which works for most cases out of the box. More recently RESTful architectures are gaining popularity among web services. Given the absence of easy-to-use web tools for SIP and the large set of specifications for a SIP system, web developers tend to resort to quick and dirty hacks which in the end are short term and not interoperable. Hence there is a need for an easy-to-use RESTful architecture for SIP-based systems that allows quick application development by web developers. This article proposes such an architecture.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;What exactly is difficult?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;SIP supports both UDP and TCP transports. Many earlier systems implemented UDP, whereas both transports are a must for SIP proxy servers. In client-server communication, with several clients behind NAT and firewall, UDP causes problems. Secondly, with UDP you also need the reliability of transactions and hence the transaction state machines in SIP. The SIP request forking and early media features have created a lot of stir and confusion among developers. Several other telephony-style features are also not needed for many Internet oriented SIP applications that do not talk to a phone network. NAT and firewall traversal are defined outside core SIP, e.g., using rport, sip-outbound. A developer usually prefers to have an integrated application library and API that is quick and easy to use. Moreover with lots of RFCs related to SIP, it becomes difficult to figure out what specifications are core and what are optional for a particular use case. 
A number of new web-based video communication systems use proprietary technologies, such as those on Flash Player, because no ready-to-use SIP library satisfies their needs.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;To solve the difficulties faced by web developers, a subset of the core features of SIP is needed as an easy-to-use API. Such an API could be available as a built-in browser feature or a plugin. Once the core set of resources is identified, the rest of the API can be customized by the application server providers and developers, or in separate communities.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;What use cases are considered?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;SIP is designed to be used consistently in different use cases such as client-to-client, client-to-server and server-to-server communication. Core SIP says that each SIP user agent (application client) has both a UAC (client) and a UAS (server). In this article I use &lt;span style="font-style: italic;"&gt;client&lt;/span&gt; to mean a user agent and &lt;span style="font-style: italic;"&gt;server&lt;/span&gt; to mean an application server, which differs from SIP terminology. Since the target audience for the proposal is application developers, only the client-server interface needs to be considered. The backend application server can translate the client-server request to appropriate SIP messaging for the server-to-server case if needed, e.g., a service provider's network may need high-performance UDP-based server-to-server SIP messages. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;What are the SIP-related resources?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;Once we focus on a small subset of the problem -- defining a RESTful API for client-server communication to access a SIP application server -- the rest of the solution falls into place naturally. 
In particular, the SIP application server will provide two core resources, "/login" and "/call", to represent the list of currently logged-in users and the list of active calls, respectively. Additionally, it can provide user profiles of signed-up users at "/user", which internally may contain things like voicemail resources for the user. The client uses standard HTTP requests, with some additional methods as shown below, to access the resources and interact with others. One difference from a standard RESTful architecture is that the client-server connection may be long lived, and is also used for notifications from server to client. In that sense it is not purely RESTful.&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;span style="font-weight: bold;"&gt;Login&lt;/span&gt;: SIP registration and unregistration are mapped to the "/login/{email}" resource, e.g., "/login/kundan@example.net". A "POST /login/{email}" with a message body containing your contacts can be used to REGISTER. The response will return your unique identifier for the login resource, e.g., "/login/{email}/{contact_id}". Later, you can use "DELETE /login/{email}/{contact_id}" to un-REGISTER or a subsequent "PUT /login/{email}/{contact_id}" to do a REGISTER refresh. The actual representation of the login contact information can be in XML, JSON or plain text and is application dependent. For example, one could combine the presence update, including rich presence, with the registration method. 
Clearly the login update requires appropriate authentication.&lt;br /&gt;&lt;pre&gt; POST /login/kundan@example.net      -- new registration&lt;br /&gt;request-body: {"contact": "sip:kundan@192.1.2.3:5062"}&lt;br /&gt;response-body: {"url": "/login/kundan@example.net", "id": 1, "expires": 3600}&lt;br /&gt;&lt;br /&gt;PUT /login/kundan@example.net/1     -- registration refresh&lt;br /&gt;request-body: "sip:kundan@192.1.2.3:5062"&lt;br /&gt;&lt;br /&gt;DELETE /login/kundan@example.net/1  -- unregister&lt;br /&gt;&lt;br /&gt;GET /login/kundan@example.net       -- get list of contact locations&lt;br /&gt;response-body: [{"id": 1, "contact": "sip:kundan@192.1.2.3:5062", ...},...]&lt;br /&gt;&lt;/pre&gt;&lt;span style="font-weight: bold;"&gt;Call&lt;/span&gt;: A call is split into two parts: a conference resource and an invitation. The conference is represented using a "/call/{call_id}" resource, where a client can "POST /call" to create a new call identifier, or "POST /call/{call_id}" to join an existing call. The conference resource represents the list of participants in a call.&lt;br /&gt;&lt;pre&gt; POST /call             -- create a new call context&lt;br /&gt;request-body: {"subject": "some discussion topic", ...}&lt;br /&gt;response-body: {"id": "123", "url": "/call/123" }&lt;br /&gt;&lt;br /&gt;POST /call/123         -- join a call&lt;br /&gt;request-body: {"url": "/login/kundan@example.net", "session": "rtsp://...", ...}&lt;br /&gt;response-body: {"id": 2, "url": "/call/123/2", ...}&lt;br /&gt;&lt;br /&gt;GET /call/123          -- get participant list and call info&lt;br /&gt;response-body: {"subject": "some discussion topic",&lt;br /&gt;                "children": [{"url": "/call/123/2", "session": "rtsp://..."}]&lt;br /&gt;               }&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Invite&lt;/span&gt;: Call invitation requires a new method such as "SEND". 
For example, "SEND /login/{email}" sends the given message body to the target logged-in user. Similarly, "CANCEL /login/{email}/1" cancels a previously sent message if it has not already been delivered. The message body gives additional details such as whether the message is a call invitation or an instant message. The message body is application dependent. The SIP application server does not need to understand the message body, as long as it can relay a SEND message from one client to another. This makes a SEND closer to an XMPP &amp;lt;message&amp;gt; than to a SIP INVITE. If the callee wants to accept the call invitation, it joins the particular session URL independently.&lt;br /&gt;&lt;pre&gt; SEND /login/alok@example.net     -- send call invitation&lt;br /&gt;request-body: {"command": "invite", "url": "/call/123", "id": 567}&lt;br /&gt;&lt;br /&gt;SEND /login/alok@example.net     -- cancel an invitation&lt;br /&gt;request-body: {"command": "cancel", "url": "/call/123", "id": 567}&lt;br /&gt;&lt;br /&gt;SEND /login/kundan@example.net   -- sending a response&lt;br /&gt; request-body: {"command": "reject", "url": "/call/123", "id": 567, "reason": ...}&lt;br /&gt;&lt;/pre&gt;&lt;span style="font-weight: bold;"&gt;Event&lt;/span&gt;: SIP includes an event subscription and notification mechanism which can be used in several applications including presence updates and conference membership updates. Similarly, one needs to define a new mechanism to subscribe to any resource and get notified of a change. This gives rise to the concept of an active resource. The idea is as follows: if a client does a GET on an active resource and does not terminate the connection, then the client receives the initial state of the resource, followed by any future updates until the connection is terminated. 
Future updates may carry the full state or only a difference, depending on a request parameter.&lt;br /&gt;&lt;pre&gt; GET /call/123          -- keep track of membership information&lt;br /&gt;response 1:  ...      -- initial membership information&lt;br /&gt;response 2:  ...      -- any addition or deletion in the membership&lt;br /&gt;&lt;br /&gt;GET /login/kundan@example.net -- keep track of presence updates&lt;br /&gt;response 1:  ...      -- initial presence information&lt;br /&gt;response 2:  ...      -- subsequent presence updates.&lt;br /&gt;&lt;/pre&gt;&lt;span style="font-weight: bold;"&gt;Profile and messages&lt;/span&gt;: The SIP application server will host user profiles at "/user/{user_id}". The concept of a user identifier will be implementation dependent. In particular, the client could "POST /user" to create a new user account, and get the identifier in the response. It can then do a "GET /user/{user_id}" to learn the various URLs for the contact locations of this user. It can then do a GET on such a URL to fetch the contacts, or do a SEND on that URL to send a message or call invitation.&lt;br /&gt;&lt;pre&gt; POST /user                            -- signup with a new account&lt;br /&gt;request-body: {"email": "kundan@example.net", ...}&lt;br /&gt;response-body: {"id": "kundan@example.net", "url": "/user/kundan@example.net" }&lt;br /&gt;&lt;br /&gt;POST /user/kundan@example.net/message  -- send offline messages (voice/video mail)&lt;br /&gt; request-body: {"url": "rtsp://..."}&lt;br /&gt;&lt;br /&gt;GET /user/kundan@example.net/message   -- retrieve list of messages&lt;br /&gt; response-body: [{"url": "rtsp://...", ...}, ...]&lt;br /&gt;&lt;/pre&gt;&lt;span style="font-weight: bold;"&gt;Miscellaneous&lt;/span&gt;: There are several other design questions that are left unanswered in the above text. Most of these can be answered intuitively. For example, the HTTP authentication credential identifies the sender of a message, i.e., the SIP "From" header. 
Sequential or parallel forking is a decision left to the client application. The decision whether to use an SDP or XML-based session description is application and implementation dependent. For example, if the client is creating a conference on an RTSP server, it will just send the RTSP URL in the call invitations. Similarly, for Flash Player conferencing it will send an RTMP URL in the call invitation. Call properties such as a participant's session description can be fetched by accessing the call resource on the server. Thus, whether an RTSP/RTMP server or a multicast address is used to host a conference is entirely client or application dependent. The application server will provide tools to allow such freedom.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Conclusion&lt;/span&gt;: This article described the idea of a RESTful interface to a SIP application server. The idea looks feasible using existing software and tools, and hopefully will benefit both the web developer and SIP communities by enabling wider usage of SIP systems. The goal is not to replace SIP, but to provide a new mechanism that allows web-centric applications to use the services of a SIP application server, and to allow building such an easy-to-use SIP application server.&lt;br /&gt;&lt;br /&gt;Several of the pieces described in this article are already implemented in Python, e.g., &lt;a href="http://code.google.com/p/restlite"&gt;RESTful server tools&lt;/a&gt;, &lt;a href="http://code.google.com/p/videocity"&gt;video conferencing application server&lt;/a&gt;, &lt;a href="http://code.google.com/p/siprtmp"&gt;SIP-RTMP translation&lt;/a&gt; and &lt;a href="http://code.google.com/p/p2p-sip"&gt;SIP server and client library&lt;/a&gt;. The next step would be to combine these pieces to build a complete &lt;a href="http://myprojectguide.org/node/6#comment-43"&gt;REST and SIP project&lt;/a&gt;. 
If you are interested in doing the project, feel free to get in touch with me!&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-7618697565282252451?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/7618697565282252451/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=7618697565282252451' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7618697565282252451'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7618697565282252451'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/11/rest-and-sip.html' title='REST and SIP'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-1738704663453564844</id><published>2009-11-30T18:17:00.002-05:00</published><updated>2009-12-24T17:01:22.195-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='RESTful'/><category scheme='http://www.blogger.com/atom/ns#' term='restlite'/><category scheme='http://www.blogger.com/atom/ns#' term='REST'/><category scheme='http://www.blogger.com/atom/ns#' term='Open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><title type='text'>REST, RESTful and restlite</title><content type='html'>This post announces a new open source software: &lt;a 
href="http://code.google.com/p/restlite/"&gt;http://code.google.com/p/restlite/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;div&gt;&lt;div&gt;&lt;b&gt;What is restlite? &lt;/b&gt;Restlite is a light-weight Python implementation of server tools for quick prototyping of your RESTful web service. Instead of building a complex framework, it aims at providing functions and classes that allow you to build your own application.&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span"  style="color:#000099;"&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;restlite = REST + Python + JSON + XML + SQLite + authentication&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Features&lt;/b&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Very lightweight module: a single file in pure Python with no other dependencies, hence ideal for quick prototyping.&lt;/li&gt;&lt;li&gt;Two levels of API: one is not intrusive (for low level WSGI) and the other is intrusive (for high level @resource).&lt;/li&gt;&lt;li&gt;High level API can conveniently use sqlite3 database for resource storage.&lt;/li&gt;&lt;li&gt;Common list and tuple-based representation that is converted to JSON and/or XML.&lt;/li&gt;&lt;li&gt;Supports pure REST as well as allows browser and Flash Player access (with GET, POST only).&lt;/li&gt;&lt;li&gt;Integrates unit testing using doctest module.&lt;/li&gt;&lt;li&gt;Handles HTTP cookies and authentication.&lt;/li&gt;&lt;li&gt;Integrates well with WSGI compliant applications.&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;b&gt;Motivation: &lt;/b&gt;As you may have noticed, the software provides tools such as (1) a WSGI-compliant router that does regular-expression-based request matching and dispatching, (2) high-level resource representation using a decorator and variable binding, (3) functions for converting from a unified list representation to JSON and XML, and (4) data model and authentication classes. These tools can be used independently of each other. 
For example, you need only the router function to implement RESTful web services. If you also want to do high-level definitions of your resources, you can use the @resource decorator, or bind functions to convert your function or object to a WSGI-compliant application that can be given to the router. You can return any representation from your application. However, if you want to support multiple consistent representations of XML and JSON, you can use the represent function of request.response method to do so. Finally, you can have any data model you like, but implementations of a common SQL-style data model and of HTTP basic and cookie-based authentication are provided for you to use if needed.&lt;br /&gt;&lt;br /&gt;This software is provided in the hope that it helps you quickly realize RESTful services in your application without having to deal with the burden of large and complex frameworks. Any feedback is appreciated. If you have trouble using the software or want to learn more about how to use it, feel free to send me a note!&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-1738704663453564844?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/1738704663453564844/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=1738704663453564844' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/1738704663453564844'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/1738704663453564844'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/11/rest-restful-and-restlite.html' title='REST, RESTful and restlite'/><author><name>Kundan 
Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-7016080915170369734</id><published>2009-11-10T13:40:00.020-05:00</published><updated>2009-11-24T03:29:35.028-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='RTP'/><category scheme='http://www.blogger.com/atom/ns#' term='ICE'/><category scheme='http://www.blogger.com/atom/ns#' term='IAX'/><category scheme='http://www.blogger.com/atom/ns#' term='H.323'/><category scheme='http://www.blogger.com/atom/ns#' term='Skype'/><category scheme='http://www.blogger.com/atom/ns#' term='XMPP'/><category scheme='http://www.blogger.com/atom/ns#' term='Protocols'/><title type='text'>Protocol Jungle of Internet multimedia communication</title><content type='html'>The diagram shows several protocols for Internet multimedia communication. (Click on the diagram to see the full size picture.) In the protocol jungle, a protocol is analogous to a species, and its real-world implementation or deployment is an animal of that species. Some animals or species compete with each other for survival. Some animals live with each other in harmony. Some animals do not care about or interact with each other since they live in different places, i.e., applications or domains. Evolution and mutation result in the long-lasting survival of some species, whereas others become extinct. 
I use a protocol jungle metaphor rather than a protocol zoo because, unlike in a closely guarded and nurtured zoo, there is real competition between protocols once big companies have invested in particular ones.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_j-OZz2I3T5A/SwGfEUEXuxI/AAAAAAAAAA4/cmEDlr_AZkI/s1600/protocols.gif"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 300px;" src="http://4.bp.blogspot.com/_j-OZz2I3T5A/SwGfEUEXuxI/AAAAAAAAAA4/cmEDlr_AZkI/s200/protocols.gif" border="0" alt="" id="BLOGGER_PHOTO_ID_5404775924276640530" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The diagram shows each species and its relationships with other species, e.g., whether A uses B or whether A and B are friendly. Due to space constraints, some items are grouped together, e.g., all the audio/video codecs, and some relationships are missing, e.g., RTMP is friendly with Speex. Ideally, we need a multi-dimensional representation to show multiple aspects of the jungle and how they are related. 
The following text lists the protocols that serve similar or common functions, and usually are competing within that function.&lt;br /&gt;&lt;br /&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Function&lt;/td&gt;&lt;td&gt;Protocols&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Structured data encoding&lt;/td&gt;&lt;td&gt;XML, ASN.1, RFC822, others&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Audio encoding&lt;/td&gt;&lt;td&gt;G.711, G.723.1, G.722, G.726, G.728, G.729, MP3, Speex, Nellymoser, AMR, Silk, GIPS, etc.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Video encoding&lt;/td&gt;&lt;td&gt;H.261, H.263, H.264, MPEG, Sorenson, Vidyo, etc.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Media transport&lt;/td&gt;&lt;td&gt;RTP/RTCP, SRTP, ZRTP, Skype, IAX, RTMP, RTMFP&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Rendezvous&lt;/td&gt;&lt;td&gt;SIP, H.323, Skype, Stratus/RTMFP&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Session description&lt;/td&gt;&lt;td&gt;SDP, H.245, Jingle&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Session negotiation&lt;/td&gt;&lt;td&gt;SIP/SDP, H.245, Jingle, Skype, RTMFP&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Call signaling and control&lt;/td&gt;&lt;td&gt;SIP, H.225/Q.931, Skype, IAX, MGCP, SCCP (Skinny), RTMFP&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Streaming media control&lt;/td&gt;&lt;td&gt;RTSP, RTMP&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Session announcement&lt;/td&gt;&lt;td&gt;SAP&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Connectivity&lt;/td&gt;&lt;td&gt;ICE/STUN/TURN, Skype, RTMFP&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Remote Procedure Call&lt;/td&gt;&lt;td&gt;SOAP, XMLRPC, REST, RTMP&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Programming calls&lt;/td&gt;&lt;td&gt;CGI, CPL, CCXML, MSCML&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Programming voice dialog&lt;/td&gt;&lt;td&gt;VoiceXML&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Instant messaging&lt;/td&gt;&lt;td&gt;XMPP, SIMPLE, MSRP&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Presence&lt;/td&gt;&lt;td&gt;XMPP, 
SIMPLE&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Shared resource access&lt;/td&gt;&lt;td&gt;REST, XMPP, XCAP&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Shared state&lt;/td&gt;&lt;td&gt;XMPP, RTMP, HTTP&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;As you can see, a SIP system typically employs one protocol for one task or a few related tasks, whereas integrated monolithic systems such as those based on RTMP/RTMFP, Skype or IAX tend to combine multiple functions in a single protocol. I have not listed H.32x protocols other than H.323 because those are intended for non-IP networks. Nevertheless, there are several H.32x systems, e.g., for room-based video conferencing or for carrying voice among carriers.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Interworking&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;div&gt;With multiple protocols available for the same function, interoperability or interworking among them becomes important. I talked about SIP and XMPP interworking in the last post. I have hands-on experience with several of the interworking scenarios among protocols shown in the diagram. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;H.323-H.324: One of my projects in my first job was interworking between H.323 and H.324. Since both these systems use H.245 as the main session description and negotiation protocol, the interworking task is relatively simple. I also worked on part of an H.320 system to try to build H.323-H.320 interworking, but did not complete it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;SIP-H.323: One of my first projects during my M.S. at Columbia University was SIP-H.323 interworking. I wrote the &lt;a href="http://www1.cs.columbia.edu/irt/cinema/doc/siph323.html"&gt;sip323&lt;/a&gt; software and a couple of Internet drafts and papers [1] on this. My PhD thesis gives a complete interworking procedure for basic call setup and registration. 
The conclusion was that while basic call setup and registration are easy to interwork, full interworking of all the supplementary services is not feasible and not even needed in many cases. Since both SIP and H.323 use RTP/RTCP for media transport and can use the same set of codecs, the signaling gateway is efficient. The company SIPquest, which productized my software, demonstrated 10k simultaneous calls (&lt;a href="http://www.tmcnet.com/usubmit/2004/Feb/1024565.htm"&gt;this article&lt;/a&gt;). &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;SIP-RTSP: These protocols serve different purposes, but it is possible to build a system that needs both these functions in a standards-compliant way. The &lt;a href="http://www.cs.columbia.edu/irt/cinema/doc/sipum.html"&gt;sipum&lt;/a&gt; software is a voice mail and answering machine that uses SIP for calls and RTSP for recording and playback of media. Since both use RTP/RTCP for media transport and can use the same set of codecs, the software is efficient, as the media path can bypass it. Please see my papers [1] for details.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;SIP-RTMP: There have been several attempts at implementing Flash-based SIP systems, and a SIP-RTMP translator is one of the approaches. Some existing projects that implement this are &lt;a href="http://code.google.com/p/siprtmp/"&gt;siprtmp&lt;/a&gt;, &lt;a href="http://www.gtalk2voip.com/sipper/"&gt;gtalk2voip&lt;/a&gt;, &lt;a href="http://code.google.com/p/red5phone/"&gt;red5phone&lt;/a&gt; and &lt;a href="http://www.flaphone.com/"&gt;flaphone&lt;/a&gt;. Since RTMP is an integrated streaming protocol which can also do control and RPC, the translator is inefficient because it needs to sit in the media path as well.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;SIP-Skype: Because Skype is a proprietary protocol, it is not easy to interwork with it. 
However, Skype itself uses SIP to allow trunking with PSTN providers, and recently there was some news about a SIP-based Skype gateway for enterprises.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;SIP-IAX: Although IAX is open, it is an integrated protocol that combines media and signaling in the same connection, and hence suffers from the same scalability problem as other integrated protocols like RTMP. Asterisk also has a SIP gateway so that it can talk to SIP-enabled devices, especially carrier equipment.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;SIP-XMPP: There is an &lt;a href="http://mail.jabber.org/mailman/listinfo/sip-xmpp"&gt;interest group&lt;/a&gt; that discusses this in depth. My last post gives more links about the interworking scenarios using a gateway or co-location in the client.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;SIP-RTMFP: Given the P2P promise of RTMFP, a gateway between these two protocols would be able to connect the proprietary Adobe protocol with the rest of the world for a true web-based end-to-end media path. I haven't seen any system that does this.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;SIP-H.320: This gateway is particularly useful for existing room-based video conferencing systems that want to connect with more Internet-style SIP devices. The idea is similar to the SIP-H.323 translator, and in fact a real deployment may use two gateways: SIP-H.323 and H.323-H.320.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;RTMP-XMPP: Since RTMP and XMPP serve two completely different functions, there is no need to interoperate. However, people have built systems that use XMPP for messaging and signaling while using RTMP for the media path. Unfortunately, since the Jingle extension wants to define its own end-to-end session, it is not very useful for exchanging RTMP server session information. 
In practice, one uses custom XMPP extensions based on presence and message to rendezvous, but does session control and call management in RTMP itself.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;XMPP-SIMPLE: The SIP-XMPP interest group is also looking at SIMPLE-XMPP translation. However, given the disconnect between the two protocols, it is likely that all the presence and message updates go through the gateway, which is hence not as efficient as one would want for presence and instant messages. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;RTMP-Skype: Now this is going to be really tough because, firstly, Skype is still a proprietary protocol, and secondly, both are integrated protocols, hence requiring complete conversion of signaling and media. A specific example could be allowing people to access Skype from web pages, e.g., by having a simple RTMP server in the Skype application itself. This works if Skype is running on your local computer. Alternatively, you need the Flash application to connect to Skype super-nodes running on public computers using RTMP. This poses a security risk and is inefficient. Why inefficient? Because RTMP over TCP means that only the applications on the public Internet will be able to receive the connection, and RTMP is not really good for real-time interactive communication because of its latency and buffering. However, if such a gateway is incorporated in Skype, then Skype truly becomes ubiquitous to web applications.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;RTMP-RTSP: These are two competing streaming protocols. Instead of having a gateway that translates between the two protocols, it might be better to build an integrated client or integrated server -- you can record using RTMP and view using QuickTime (RTSP), or you can use the same client to access real-time streams from RTMP or RTSP. 
Since RTMP incorporates RPC along with streaming control and media path, whereas RTSP is just streaming control, a complete translation of all the functions may not be feasible.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;ASN.1-XML: There have been efforts to standardize this, e.g., XER. The proposed H.325 standard by ITU-T will use XML while allowing compatibility with some of the predecessors which are in ASN.1 PER. ASN.1 and XML are just data formats and for the purpose of P2P-SIP, they are not very significant.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If you have data about the usage in real deployments of particular protocol(s), feel free to post a comment.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;[1] My publication page &lt;a href="http://kundansingh.com/#papers"&gt;http://kundansingh.com/#papers&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-7016080915170369734?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/7016080915170369734/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=7016080915170369734' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7016080915170369734'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7016080915170369734'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/11/protocol-jungle-of-internet-multimedia.html' title='Protocol Jungle of Internet multimedia communication'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_j-OZz2I3T5A/SwGfEUEXuxI/AAAAAAAAAA4/cmEDlr_AZkI/s72-c/protocols.gif' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-3668110850144182944</id><published>2009-11-09T15:31:00.004-05:00</published><updated>2009-11-24T03:30:36.856-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='SIP-XMPP'/><category scheme='http://www.blogger.com/atom/ns#' term='Interworking'/><category scheme='http://www.blogger.com/atom/ns#' term='Gateway'/><category scheme='http://www.blogger.com/atom/ns#' term='XMPP'/><title type='text'>SIP vs XMPP or SIP and XMPP?</title><content type='html'>(This post is unrelated to P2P, and describes the differences between the two sets of protocols, SIP and XMPP. I have implemented both SIP and XMPP, as well as used several existing libraries for SIP and XMPP, so I can comment on the two sets of standards from a developer's point of view as well.)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;History&lt;/span&gt;&lt;br /&gt;SIP was invented to provide rendezvous for session establishment and negotiation on the Internet. XMPP (or Jabber) was invented to do structured data exchange such as synchronous or active presence and text communication among a group of people. XMPP evolved from instant messaging and presence, whereas SIP evolved from Internet voice/video communication. 
Later, XMPP added support for session negotiation using the Jingle extension, and the SIP community added extensions such as SIMPLE to support instant messaging and presence.&lt;br /&gt;&lt;br /&gt;Technically, comparing SIP and XMPP is like comparing apples and oranges because the core protocols serve different purposes: session rendezvous/establishment vs structured data exchange. On the other hand, because of the extensions invented in both protocol worlds, SIMPLE and Jingle, they now have overlapping functions and can be compared. When one compares SIP vs XMPP, the comparison is actually SIP/SIMPLE vs XMPP for IM and presence and/or SIP/SDP vs XMPP/Jingle for session negotiation. Even though the goals of the two sets of protocols are converging, there are fundamental architectural differences that I will enumerate in this article. There are other articles on SIP vs XMPP [&lt;a href="http://blog.isode.com/2008/02/sip-and-xmpp--.html"&gt;1&lt;/a&gt;, &lt;a href="http://www.infoworld.com/t/platforms/xmpp-vs-simple-race-messaging-standards-295"&gt;2&lt;/a&gt;, &lt;a href="http://www.scribd.com/doc/193634/Jabber-Inc-SIP-RTP-XMPP-White-Paper"&gt;3&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Differences: SIP vs XMPP&lt;/span&gt;&lt;br /&gt;The following table lists the crucial differences between the two sets of protocols.&lt;br /&gt;&lt;br /&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;SIP&lt;/td&gt;&lt;td&gt;XMPP&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Purpose&lt;/td&gt;&lt;td&gt;Provide rendezvous for session establishment and negotiation where the actual session is independent, e.g., over RTP media transport.&lt;/td&gt;&lt;td&gt;Provide a streaming pipe for structured data exchange between a group of clients with the help of server(s), e.g., for instant messaging and presence.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Protocol&lt;/td&gt;&lt;td&gt;Text-based request-response protocol similar to HTTP, where core 
attributes are signaled using headers, and additional data using message body, e.g., session description of capabilities.&lt;br /&gt;&lt;/td&gt;&lt;td&gt;XML-based client-server protocol to create a streaming pipe on which it sends request, response, indication or error using XML stanzas between client and server, and between servers.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Transport&lt;/td&gt;&lt;td&gt;Usually implemented over connectionless UDP as well as connection-oriented TCP transport. Also works over secure TLS transport.&lt;/td&gt;&lt;td&gt;Works over connection-oriented TCP or TLS transport.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Connection&lt;/td&gt;&lt;td&gt;A user-agent is both client and server, hence can initiate or accept connections in the case of TCP or TLS. This does not work well with NATs and firewalls, hence extensions are defined to use reverse connections when the server wants to send a message to the client.&lt;/td&gt;&lt;td&gt;The client initiates the connection to the server, which works well with NATs and firewalls. Additionally, extensions such as BOSH are defined to carry XMPP stanzas over HTTP to work with very restricted firewalls.&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;There are many other differences, e.g., the way a URI is represented, or how authentication is done, or what kinds of messages are supported. I will not go into details of those since they tend to become too specific to the kind of application and we miss the important points. From a developer's point of view 'ease of programming' is very important.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Ease of programming&lt;/span&gt;&lt;br /&gt;Both SIP and XMPP are easy to implement. My 39 peers project has modules for both in a few thousand lines of Python code. 
Although the basic protocol is easy to implement, a complete system such as a collaboration client with audio/video and messaging/presence support is very complex.&lt;br /&gt;&lt;br /&gt;Because of the way these protocols have originated, they are well suited for certain kinds of applications. For example, if you want to build an audio/video communication system, it is better to start with SIP. Features such as interoperability with other VoIP phones, incorporating any-cast call distribution, or using an existing VoIP provider for trunking are easy and readily available using SIP. If you want to build an instant messaging and presence client, it is better to start with XMPP. Features such as a friends roster, group chat, blocking a user, storing offline messages, etc., are readily available using XMPP. Any advanced communication or collaboration system needs to include both these kinds of features.&lt;br /&gt;&lt;br /&gt;XMPP has solved the application's problems and has defined mechanisms for several commonly used features in an instant messenger-type or shared state-type application, e.g., group chat, visiting card, avatars, etc. The emphasis is on application design, use cases, and practical solutions.&lt;br /&gt;&lt;br /&gt;I think there are two main reasons for SIP's difficulty among developers: (1) the emphasis of SIP is on interoperability rather than application and feature design, and (2) the emphasis in the SIP community is to have one protocol solve one problem, which requires implementing a plethora of protocols for a complete system. Let me explain these further.&lt;br /&gt;&lt;br /&gt;When a new VoIP feature is implemented by one phone, it must interoperate with another phone or VoIP service provider. Hence most SIP extensions focus on wire-protocol and interoperability mechanisms. Although specifications of several SIP extensions are available, there is no evaluation or open reference implementation showing how they fit in the overall design. 
More recently efforts have been made, including my &lt;a href="http://tools.ietf.org/html/rfc5638"&gt;RFC 5638&lt;/a&gt; (Simple SIP Usage Scenario for Applications in the Endpoints), to simplify the specifications for certain types of SIP applications -- those endpoints that want to work in the web and Internet world without the legacy of the traditional telephony systems.&lt;br /&gt;&lt;br /&gt;Secondly, the SIP community tries to keep one protocol to solve one problem. Some extensions deviate from this guideline, but they are exceptions. The problem comes when this design principle involves implementing several distinct protocols just to get a complete system. For example, a SIP system incorporates other external mechanisms such as STUN, TURN, ICE, reverse-connection-reuse and rport-based symmetric request routing to solve the NAT and firewall traversal problem, and still does not guarantee media connectivity in all scenarios unless an HTTPS/TCP tunnel is used. Implementing instant messaging and presence involves implementing several RFCs and drafts related to Event, PUBLISH, CPIM, PIDF, XCAP, MSRP, and still the application does not have all the features of a commonly available XMPP client. In summary, the SIP community has created numerous extensions for solving several problems in a way that scares away a new developer!&lt;br /&gt;&lt;br /&gt;As you can see, both these reasons (emphasis on interoperability and one-protocol-one-problem) are ideal in theory. So what is wrong? The practice. 
To solve these problems, (1) IETF working groups should not proceed with a draft without an open-source and simple &lt;span style="font-style: italic;"&gt;reference implementation&lt;/span&gt;,  (2) IETF working groups should build &lt;span style="font-style: italic;"&gt;reference applications&lt;/span&gt; combining several protocols for different kinds of applications and evaluate (a) consistency and (b) ease of programming.&lt;br /&gt;&lt;br /&gt;Consistency indicates whether the new extension is consistent with existing guidelines, best practices, protocol format, as well as design principles. For example, if an extension introduces new processing in the server which could have been done in the endpoint, then it is against the principle of intelligence in the endpoints. Such extensions should be marked as such so that developers know the trade-off. There are only a few good design principles, hence creating a consistency matrix of extensions against principles should be easy.&lt;br /&gt;&lt;br /&gt;Ease of programming is determined by three things: (1) how easy it is to implement the set of protocols, (2) how easy it is to build a real application using those protocols, and (3) how easy it is to build the real application using existing platforms and tools. The first is usually available as a software library, the second as an application and the third is re-usability. It should be easy to not only build the library but also use the library to build a usable application. Every new extension adds new things to the library, which cause more interaction in the application and hence more complexity. When a software project is started, interoperability is usually not the highest requirement, but re-usability, short development time and a real prototype application are crucial requirements. Once the project is started on one path, it is very difficult to change the path by changing the core communication protocol. 
If there are reference implementations then not only do they help you get started quickly, but it also becomes easy to see how much additional complexity a particular SIP extension brings to the application. An important programming quote: &lt;span style="font-style: italic;"&gt;less is better than more&lt;/span&gt;!&lt;br /&gt;&lt;br /&gt;The flexibility of SIP also comes with its limitations. For example, SIP is flexible enough to support both UDP and TCP transport. However, UDP is treated as a second-class citizen by many programming languages or libraries even today, e.g., Tcl didn't support built-in UDP sockets when it came out, and Adobe ActionScript does not have built-in UDP sockets for Flash Player even now. This prevents a developer from building a complete SIP stack as a Flash application, for example. If you look further, you would expect that if UDP is not supported then the platform is not suitable for real-time communication anyway. However, this does not prevent web-style developers from implementing XMPP in ActionScript, and perhaps tweaking it to support signaling of media sessions as well. The result is a broken or non-interoperable software application.&lt;br /&gt;&lt;br /&gt;Reviewing the evolution of SIP vs XMPP specifications, I think XMPP has defined an architecture that allows adding new extensions easily and hence reduces the application complexity, whereas SIP extensions have focused on interoperability and wire-protocol without much-needed attention to application design. While application design may seem unnecessary for protocol specification, it is very important in the short term. Consider a developer who uses some data structures for representing protocol elements.  If a new extension is defined in XMPP, and it reuses the existing XML format that gets readily mapped to the data  structures, it becomes very easy to incorporate this new extension in his source code. 
If a new extension is defined in SIP or SDP, which re-uses an existing mechanism of another protocol for which there is no real implementation available, then the developer will first have to implement that other mechanism, then integrate it with SIP or SDP. The mechanism may have its own formatting which needs to be incorporated in the data structures. Essentially the developer will have to spend more time implementing such an extension.  In the end, the actual format of the message, whether text-based or XML-based, is not terribly difficult once you have a library for message formatting and parsing. However, if an extension uses a different format, connections, sessions, etc., that are not readily available in existing libraries and tools, complexity arises. For example, adding ICE to SIP/SDP created a custom format whereas ICE in XMPP/Jingle re-used XML. Another example is how a particular endpoint is identified in XMPP vs SIP. In XMPP the URI itself is extended to include the resource, e.g., "user@domain/resource", whereas in SIP a new extension such as globally routable user-agent URI (GRUU) is defined which is, well, more programming effort!&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Scalability and performance&lt;br /&gt;&lt;/span&gt;SIP is inherently a peer-to-peer protocol whereas XMPP is inherently client-server. Tasks that are easy in client-server systems such as shared state, roster storage on server, or offline messages on server, are well accomplished with XMPP. On the other hand, one of the primary goals of SIP is to keep the intelligence in the endpoint. Ideally, a SIP proxy server does not even maintain the session state for the SIP dialog. A few messages in SIP such as REGISTER and PUBLISH are intended for client-server communication. In XMPP, a server is a must and all signaling communication goes through the server. 
There are message semantics defined for the types of messages, e.g., client-server information query, client-server-client message sending, client-server event publishing and server-client event notifications. Clearly client-server applications are limited by scalability and performance of the server. For example, an instant messaging session need not go through the SIP server, saving bandwidth and processing at the server. But that means you lose the offline message storage feature at the server. In real SIP applications today, servers have become an integral part of the system and hence the scalability difference diminishes. In fact, the bulky message format of SIMPLE makes it less scalable than XMPP for presence updates that go through the server. Note also that although P2P-SIP is possible, P2P-XMPP is not easy because XMPP is inherently client-server.&lt;br /&gt;&lt;br /&gt;Once we know this, we understand that SIP and XMPP systems solve two different problems, are designed for two different architectures and have evolved with two different guidelines. From here, you can do two things: either try to incorporate/translate all the features of one system to the other and eventually give up, or try to design a system that uses the best of both worlds.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Interworking and co-location&lt;/span&gt;&lt;br /&gt;There have been attempts to interwork SIP/SIMPLE and XMPP, especially the IM and presence part [&lt;a href="https://stpeter.im/index.php/2008/01/04/interworking-3/"&gt;draft-saintandre-sip-xmpp-*&lt;/a&gt;, &lt;a href="http://www.ietf.org/id/draft-veikkolainen-sip-voip-xmpp-im-01.txt"&gt;&lt;span style="text-decoration: underline;"&gt;draft-veikkolainen-sip-voip-xmpp-*&lt;/span&gt;&lt;/a&gt;]. 
The first reference shows how to implement a gateway to connect SIP and XMPP networks, and the second shows how to implement a client that can support both SIP and XMPP and correlate the two protocol messages if the user is connected to both servers by the same provider. The popular OpenSER (now OpenSIPS and Kamailio) SIP server has a Jabber module to interwork with an XMPP network. People have developed clients that can understand both SIP and XMPP. Interworking is complex, and not all features can be completely translated or used from one protocol to another, unless the protocol is changed a lot with custom hacks.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Conclusion&lt;/span&gt;&lt;br /&gt;Industry experts predict that both SIP and XMPP will stay for a long time. Rather than arguing about the differences or trying to mend the protocols to be like each other, one could build systems that use both these protocols for what each is good at. XMPP is good at creating application level streaming/secure/client-server pipes that can be used for shared state, one-to-many message delivery and publish-subscribe-notify-type use cases. SIP is good at rendezvous for session establishment and at negotiating the parameters of a session that is then established separately.&lt;br /&gt;&lt;br /&gt;To interwork between XMPP and SIP, you could (1) use a gateway at the server to translate the basic functions, (2) learn or send SIP parameters over an XMPP message from a client, or (3) use SIP to establish an XMPP chat session with a client. For example, a multi-protocol client of user alice@example.net may be talking to bob@home.com over SIP, and discover that both clients support XMPP, and then add each other to the XMPP roster or start an XMPP chat session. Alternatively, if they are chatting over XMPP and discover that the other supports SIP as well, then they initiate a SIP session for a multimedia call. 
Implementing both the protocols in the client is better than in the gateway for scalability and robustness. There are other interworking architectures possible, e.g., having two XMPP servers use SIP to communicate with each other or talk to a trunking provider, or having an integrated SIP-XMPP server that allows both SIP and XMPP users to seamlessly communicate with each other. These modes, however, are not interesting from a P2P point of view.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-3668110850144182944?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/3668110850144182944/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=3668110850144182944' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3668110850144182944'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3668110850144182944'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/11/sip-vs-xmpp-or-sip-and-xmpp.html' title='SIP vs XMPP or SIP and XMPP?'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-6941623529379566456</id><published>2009-10-29T17:42:00.004-04:00</published><updated>2009-11-24T03:31:10.084-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='reliability'/><category 
scheme='http://www.blogger.com/atom/ns#' term='scalability'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><title type='text'>Reliability and scalability in P2P-SIP</title><content type='html'>In this article I list the important definitions that contribute to reliability and scalability of distributed systems, especially P2P-SIP. Scalability is defined as the ease with which a system can handle growth of demand (load, users, requests, etc.), and reliability is defined as the ease with which a system handles failure or loss (fault tolerance, availability).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Stateless&lt;/span&gt;: The amount of state stored in each networked component limits the scalability (as well as robustness). A truly stateless protocol or service does not need to store any state in the system. At the application level, a session oriented protocol such as RTSP or RTMP is stateful, whereas HTTP is stateless, although it can be made stateful using session cookies. At the transport level, TCP is stateful whereas UDP is stateless. SIP networked components come in several flavors: stateless, transaction stateful and session stateful. A stateful component has limited scalability not because of the storage requirement of the state, but due to matching of a new request against the existing states -- this requires exclusive or read-write-locked access to shared resources and hence is slow. On the other hand a stateless request has all the information that is needed for handling that request. Secondly, a stateful component has limited reliability because if the component fails the state is lost and must be re-established, e.g., using a new SIP transaction or new HTTP authentication. 
If you must use a stateful component, then some form of distributed or replicated shared state is desirable.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Partitioning&lt;/span&gt;: Data or service partitioning among multiple machines helps in reducing load on each machine. A naive approach of using a hash function 'H(data.key) mod N' works for distributing data based on the lookup key among N identical and robust servers. However, the hashing algorithm remaps the majority of the data if N changes -- a new server is added or an old one fails. On the other hand, &lt;span style="font-style:italic;"&gt;consistent&lt;/span&gt; hashing using a large hash space, e.g., MD5, maps both data using H(data.key) and machines using id=H(machine.name) in the same identifier space. A machine can then store the data whose keys are close to its own id. This principle is used in several structured P2P algorithms such as Chord, Pastry, CAN, and works especially well with a large number of machines with a significant amount of churn. Partitioning can be applied to services as well. Partitioning improves system scalability, but comes with an overhead of maintaining the correct partition when machines come and go.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Replication&lt;/span&gt;: Replication of data improves its reliability (also known as availability) as well as scalability to some extent. In a simple master-slave replication of data, a write is done to the master, which replicates the command to the slave, and a read can be done from either master or slave servers. With higher failure probability, such as in P2P, you need more replicas. With N replicas of some data, you can still get the data if N-1 machines holding the replicas fail. Replication interacts with partitioning -- you want to keep access to the replica almost as fast as the original data. 
Typically a machine can replicate the data to its N (typically small, 2-8) neighbors and when the machine fails, the next data query automatically gets to its ex-neighbors. Note that another approach where a different hashing function, e.g., H(i+data.key), is used to store the i'th replica does not work well, because replication comes with the overhead of requiring the machines to keep the replicas up-to-date. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Redundancy&lt;/span&gt;: Replication is an example of redundancy of data. Redundancy can be applied to servers and services as well. A stateless protocol or service helps in easily deploying redundant server nodes. Data redundancy goes beyond simple replication of a data object to N places. Suppose a large file is split into M chunks such that only N chunks are needed to reconstruct the original file, N&lt;=M. Assuming that many data sources are fast, this approach not only improves reliability but also performance in terms of how quickly you can download the file. Such techniques are used in P2P file sharing applications, and can be applied to P2P-SIP as well, e.g., for storing video mails or live streaming. Redundancy improves reliability as well as scalability of the system.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Load sharing&lt;/span&gt;: Load sharing is one type of redundancy where the load of the system is shared among N redundant machines. Although not required, load sharing typically works in conjunction with partitioning, e.g., in a two-stage SIP server farm where the first stage servers forward the requests to the second stage server clusters based on H(data.key). As with redundancy, a stateless protocol or service can be easily shared whereas a stateful system requires more work. P2P systems are inherently load shared among the participating peers. 
Load sharing improves system scalability.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Iterative vs recursive&lt;/span&gt;: Iterative request routing is one where a client sends a request to one destination, receives a response that redirects it to a second destination, and so on. Recursive request routing is one where a client sends a request to one node, which sends the request to another node, and so on. Iterative vs recursive is also called redirect vs proxy. Clearly, iterative routing puts less load on the networked element but more on the client, whereas recursive is the opposite. If scalability of the networked element is desired then iterative should be preferred. However, NATs and firewalls make iterative request processing difficult on the Internet. Secondly, the topology, bandwidth and connections among the networked elements sometimes make recursive routing faster and more efficient in practice than iterative. The decision to go iterative vs recursive affects the number of messages that need to be handled as well as the state carried in the message or stored in the networked element. It is easier to incorporate redundancy in iterative mode, since only the client needs to re-try to redundant destinations.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Keep-alive&lt;/span&gt;: Network protocols usually have periodic keep-alive messages to ensure connectivity or to detect failures. A stateful transport protocol such as TCP has built-in keep-alive mechanisms that can be activated using the socket API. Application protocols employ some kind of application level keep-alive, e.g., XMPP has an extension to do ping, SIP has session timers. These not only detect failure due to the network but also due to a server software crash. The keep-alive messages help in improving the reliability of the system by quickly detecting complete or partial failures. 
Keep-alives are especially important in a P2P network because of the large number of network paths that can fail.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Exponential back-off&lt;/span&gt;: After a failure has been detected, a reconnection or resend is attempted. The time for such attempts needs to be backed-off exponentially -- if the failure happens at time 0, then send the first attempt at t, say t=0.5s, and subsequent ones at 2*t, 4*t, 8*t, and so on, until it reaches a cut-off, say 5 min. After that keep attempting periodically, say, every 5 min. If the failure is transient or one time, then this mechanism quickly reconnects. On the other hand, if the failure is longer term, then it reduces the load and bandwidth for reconnection attempts. SIP uses exponential back-off when sending subsequent requests in the event of failure.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Request-response&lt;/span&gt;: Application protocols typically come in two flavors: send-and-forget and request-response. Most signaling and control protocols such as HTTP and SIP follow the request-response architecture. Media streaming protocols such as RTP usually don't send a response for every message. A request-response protocol automatically makes the entity sending the request a client and the entity receiving the request a server. The client-server distinction may be on a per-transaction basis, e.g., SIP has each endpoint act as user-agent-client and user-agent-server. Similarly, in a P2P network every peer acts as both client and server, while there may be some nodes behind NAT that can act only as client. A request-response architecture is needed where reliability of the message delivery is important or when RPC semantics is desired in the protocol. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Redundant connections&lt;/span&gt;: Redundant connections are another form of redundancy found in distributed systems. 
For example, a client may be connected to multiple servers. It periodically pings the servers and selects the best server to actually send the request to. If the best server fails, it can fail-over to the next best server. Redundant connections improve both reliability and performance of the system. Redundant connections are also useful with geographically distributed server farms to locate the closest server. The idea is to dynamically adapt instead of having statically configured connections.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Bi-directional master-slave&lt;/span&gt;: Master-slave replication is used for data reliability and to improve scalability of read-dominated applications, as mentioned before. In a bi-directional master-slave configuration, both machines act as both master and slave at the same time. Any write to either machine gets propagated to the other, and hence any read can be done from either machine. This improves scalability for write-dominated applications also, such as a SIP server. The bi-directional replication can be extended to more than two machines by incorporating a circular ring topology of replica machines. The bi-directional replication comes with the overhead of having to maintain replicas on all the machines, and needs some way to resolve the race condition in which two updates arrive at two different machines, so that the replicas eventually become consistent. For certain types of data, such as SIP contact locations of the users, it is possible to achieve such consistency.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Vertical vs horizontal scalability&lt;/span&gt;: You will often hear people talking about vertical vs horizontal scalability. Vertical scalability implies that when the load increases you identify the bottleneck and add a new component (e.g., CPU, memory, disk) to your machine to improve the scalability. 
Horizontal scalability implies that you design the system such that when the load increases you add another machine in the network to handle the load, e.g., another server in the server farm. P2P systems are horizontally scalable. Clearly vertical scalability has a limit beyond which it may not scale, whereas server farms can scale linearly by partitioning.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Proxy and Cache&lt;/span&gt;: Caching improves performance and scalability of the system. DNS has epitomized the concept. Caching is also used in web and media streaming protocols. The idea is to install a cache in an intermediate networked element which uses the cache instead of sending subsequent requests to the actual destination. HTTP is designed to heavily use caching to improve performance and scalability of the servers. Caching a negative response is more challenging since the time-to-live for the cached entry is unknown. On the other hand, real-time communication protocols such as SIP and RTP have limited use for caching in the network. Nevertheless caching of data improves the performance and scalability of the SIP server, e.g., by using an in-memory cache of the SIP contact locations instead of reading from the database every time. With caching comes the overhead of maintaining consistency among redundant data.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Crash vs byzantine failure&lt;/span&gt;: Crash and byzantine failure are two types of failure models: a crash indicates that the system has stopped working and it may be possible to detect its failure, e.g., using keep-alive; whereas a byzantine failure indicates that the system does not behave consistently as per the agreed protocol or algorithm all the time. Hence, it is very difficult to detect byzantine failures, especially if they are due to malicious intent. 
The byzantine failure and malicious node problems are especially important in P2P since the peers may not trust each other. Both crash and byzantine failure reduce the reliability of the system. Various mechanisms mentioned in this article help mitigate the crash failure, but do not help much with the byzantine failure.&lt;br /&gt;&lt;br /&gt;Readings:&lt;br /&gt;[1] Singh, K. and Schulzrinne, H., "&lt;a href="http://dx.doi.org/10.1016/j.comcom.2006.08.037"&gt;Failover, load sharing and server architecture in SIP telephony&lt;/a&gt;", Computer Communications 30, 5 (Mar. 2007), 927-942. DOI= http://dx.doi.org/10.1016/j.comcom.2006.08.037 [&lt;a href="http://kundansingh.com/papers/sipload-extended.pdf"&gt;Author's copy&lt;/a&gt;]&lt;br /&gt;[2] K.Singh, "&lt;a href="http://kundansingh.com/papers/thesis.pdf"&gt;Reliable, scalable and interoperable Internet telephony&lt;/a&gt;", PhD Thesis, Computer Science Department, Columbia University, New York, NY 10027, June, 2006.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-6941623529379566456?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/6941623529379566456/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=6941623529379566456' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/6941623529379566456'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/6941623529379566456'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/10/reliability-and-scalability-in-p2p-sip.html' title='Reliability and scalability in P2P-SIP'/><author><name>Kundan 
Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-958417136312176720</id><published>2009-10-23T14:02:00.006-04:00</published><updated>2009-11-24T03:31:36.192-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='security'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='authentication'/><title type='text'>Security in P2P-SIP</title><content type='html'>I frequently receive questions on security in P2P-SIP, mostly from researchers looking for a new topic to explore. Security in P2P-SIP (and in P2P in general) is a challenging problem. In this article I summarize my understanding of the challenges and open problems.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://kundansingh.com/papers/nyman04.pdf"&gt;first&lt;/a&gt; publication on P2P-SIP mentions that P2P-SIP needs to solve the challenges of client-server Internet telephony as well as the privacy, confidentiality, malicious node behavior and "free riding" problems of P2P. For example, a malicious node may not forward the call requests correctly or may log all call requests for future misuse. A later &lt;a href="http://www.jcbroadband.com/Library/jcbvoip5.pdf"&gt;publication&lt;/a&gt; formally identifies the key security challenges and potential solutions. The &lt;a href="http://www.cs.vu.nl/~steen/papers/2009.acm-cs.pdf"&gt;paper&lt;/a&gt; surveying DHT security techniques presents a comprehensive listing of challenges, solutions and problems. Let us classify the challenges:&lt;br /&gt;&lt;br /&gt;User Authentication: Similar to client-server SIP, authentication is essential. 
A receiver needs to verify that a sender posing as sip:bob@example.net is actually the owner of that identifier. If the user identity is based on some other information or identity owned by the user, e.g., email address, phone number, postal address, social-security number, credit card number, PKI, X.509 certificate, etc., then it is possible to delegate the identity to that mechanism, e.g., by email verification or phone caller ID verification. The challenge can be further divided into these questions: Does the user own the identity? Can the user arbitrarily pick any identity? Can a user be made to believe that he has a (wrong) ID or password? Can a malicious user get the password from another user on the pretext of authentication, so that the malicious user can later assume the other user's identity?&lt;br /&gt;&lt;br /&gt;Node Authentication: Additionally, since a number of P2P algorithms use the node identity to locate a node or define data storage criteria, the node ID is also a candidate for spoofing. A receiver must verify that the sender owns the node ID that it is posing as. The problem can be divided into sub-problems: Are the node IDs randomly picked or self-generated by (possibly malicious) nodes, or assigned securely by some authority? Can the node ID be spoofed in the protocol messages or data storage? Can a node be made to believe by other nodes that it has a (wrong) ID? Can a malicious node get the authentication credentials of another node and later assume the other node's identity? These problems, if not addressed, can result in other problems such as man-in-the-middle or denial-of-service (DoS) attacks. The most important question is: Can authentication be done in P2P without relying on a central trusted authority? 
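&lt;br /&gt;&lt;br /&gt;To make the node ID questions concrete, here is a minimal Python sketch of one option for the self-generated case: derive the node ID from a hash of the node's public key, so that a receiver can at least check that a claimed ID matches the key presented with it. This is an assumption for illustration and not a mechanism prescribed in this article; the names node_id_from_key and claims_valid_id and the choice of SHA-1 (a 160-bit identifier space) are hypothetical, and the check does not by itself prove that the sender holds the matching private key.
&lt;pre&gt;
```python
import hashlib


def node_id_from_key(public_key_bytes):
    """Derive a node ID as the SHA-1 hash of the node's public key bytes.

    Assumption for illustration: IDs live in a 160-bit identifier space.
    """
    digest = hashlib.sha1(public_key_bytes).digest()
    return int.from_bytes(digest, "big")


def claims_valid_id(claimed_id, public_key_bytes):
    """Return True only if the claimed ID equals the hash of the presented key."""
    return claimed_id == node_id_from_key(public_key_bytes)


# Placeholder bytes stand in for a real encoded public key.
honest_key = b"example public key of an honest node"
honest_id = node_id_from_key(honest_key)

# A node presenting its own key passes the check.
print(claims_valid_id(honest_id, honest_key))

# A node claiming the same ID with a different key fails the check.
print(claims_valid_id(honest_id, b"key of a different node"))
```
&lt;/pre&gt;
A real system would still need a challenge-response step using the private key, and this check does not stop a node from generating many keys until it obtains an ID at a chosen place in the topology.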
&lt;br /&gt;&lt;br /&gt;Overlay Routing: A malicious node in a P2P network can drop, alter or wrongly forward a message, intentionally manipulating the correct routing algorithm to disrupt the network and hence its availability. This partly depends on the node ID assignment mechanism: can a node intentionally place itself at a particular place in the topology? Further questions to ask: What fraction of malicious nodes affects what fraction of the P2P network? What is the relationship between the performance (availability, routing and data storage) of P2P and the fraction of malicious nodes or users?&lt;br /&gt;&lt;br /&gt;Overlay Maintenance: A malicious node may invite more malicious nodes or copies of itself into the P2P network. A malicious node may partition the P2P network so that one part cannot reach the other. A malicious node may reject join requests from other good nodes to prevent them from joining the network. The questions: What fraction of malicious nodes can affect what fraction of P2P network availability? Can a malicious node eventually affect the whole network given enough time? Can a malicious node affect the discovery of the bootstrap node by other nodes, and thereby affect their joining process? Can a malicious node intentionally place itself at a particular place in the topology (e.g., as a super peer), so that it affects a larger number of overlay messages?&lt;br /&gt;&lt;br /&gt;Free riding: A P2P network works because the peers do. If a node refuses to serve as a peer, but just uses the service of the other peers, how do you handle this? Can the system enforce or give incentive to a particular node to become part of the overlay? What fraction of the nodes must be part of the P2P overlay for the overlay to work? &lt;br /&gt;&lt;br /&gt;Privacy, Confidentiality, Anonymity: Unlike client-server telephony, in P2P-SIP the call signaling and media messages may traverse through other nodes in the system. 
Can other nodes know who is calling whom and hence infringe on the user's privacy? Worse, can a malicious peer listen to the conversation (audio, video, text, etc.) between two other peers? Can the system allow you to make anonymous calls so that the receiver does not know who is calling? Can the system allow you to receive calls (e.g., any-cast calls to call centers) without divulging your identity to the caller?&lt;br /&gt;&lt;br /&gt;SIP services: Client-server SIP implements several new features and services, but those have limited use in P2P-SIP because of the trust model. For example, programmable services using SIP-CGI or SIP Servlet are difficult unless the receiving peer can completely trust the calling peer's CGI. Emergency services, spam prevention and lawful interception, which have been researched in client-server SIP, are pretty challenging in P2P-SIP.&lt;br /&gt;&lt;br /&gt;Cost of security: Most of the existing protocols on the Internet suffer because people don't implement or deploy enough security. For example, the front web pages of many banks do not use HTTPS/TLS but have login forms. The reason is that system and operations engineers see security as an overhead, and do not use it unless really needed. P2P takes this to the extreme because the (in)security of one node can affect several others in the network. The questions: What is the cost of security? How much does performance suffer, in terms of number of messages, overhead and delay, for a particular security mechanism?&lt;br /&gt;&lt;br /&gt;Given these problems, researchers are taking several approaches to solve them. However, the core of some of these problems still remains unsolved. The general approach is to define a sub-set of the P2P-SIP system which works for the given security mechanism. 
For example, P2P authentication is very challenging -- hence most implementations use a central certificate authority (CA) that everyone trusts -- similar to the web browser model, where the browser comes installed with some root CAs. The other approach is to build a closed P2P network of trusted implementations and provide the service to the rest of the untrusted users, e.g., OpenDHT and Dynamo. This works similar to the server farm model, except that the server farm is built using a sub-set of P2P features -- self adjusting, less configuration, distributed data storage, geographically distributed. Another approach is to build a closed and proprietary system and protocol which prevents (to some extent) others from injecting a malicious node into the system, e.g., Skype. Unfortunately, sooner or later the protocol gets reverse engineered and the security is no longer present. The research on distributed trust, reward, or credit/debit systems works well for file sharing but has not been successfully proven for P2P-SIP. Finally, some researchers focus on the statistics and availability of the whole network, with the theory that a small fraction of malicious nodes does not disrupt the whole network. If there is enough incentive for the nodes to remain good, this may work well. &lt;br /&gt;&lt;br /&gt;If you are interested, please read the &lt;a href="http://www.eecs.harvard.edu/%7Emema/publications/iptps2004.pdf"&gt;article&lt;/a&gt; on when P2P makes sense. In particular, if (1) most of the peers do not trust each other AND (2) there is not much incentive to store the resources, then P2P does not work well because the system does not evolve naturally to work. Think of it this way: if people do not trust each other and do not have much incentive to help others or to store information for other people, will a person be able to get the information he needs that another person has? 
The approaches I listed in the previous paragraph all try to recast the problem such that peers trust each other, i.e., they address (1), and the system tends to evolve naturally to work. Still more research is needed in (2) to identify and develop the incentive model for the P2P-SIP use case.</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/958417136312176720/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=958417136312176720' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/958417136312176720'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/958417136312176720'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/10/security-in-p2p-sip.html' title='Security in P2P-SIP'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-5842521503821785069</id><published>2009-10-13T15:23:00.005-04:00</published><updated>2009-11-24T03:32:15.046-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Flash Player'/><category scheme='http://www.blogger.com/atom/ns#' term='crossdomain'/><category scheme='http://www.blogger.com/atom/ns#' term='security'/><category scheme='http://www.blogger.com/atom/ns#' term='policy-file'/><title type='text'>The 
(in)security of Flash Player's crossdomain</title><content type='html'>This article discusses the (in)security of Flash Player's crossdomain or cross-domain-policy mechanism and why it is against P2P. Anyone who has worked with Flash Player's network classes (URLLoader, Socket, etc.) to implement a protocol would know the problem and pain that Flash Player causes due to its (broken) crossdomain security.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;The problem:&lt;/span&gt; Programmable content downloaded from site1 and running in the user's browser should need explicit permission to connect to or use content from another site2. Otherwise, a Flash application may arbitrarily connect to or use content from any other site without the user's knowledge and pose security risks.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;The real problem:&lt;/span&gt; The real problem is in the way Flash Player tries to solve the above problem. If a Flash application is downloaded from site1, and wants to access or connect to another site2, then that site2 must give explicit permission using a web accessible http://site2/crossdomain.xml or an in-line cross-domain-policy response in the TCP connection. The crossdomain.xml file lists all the other domains (such as site1) whose Flash applications are allowed to connect here, and also lists all the ports to which they can connect. There are options to give a wild-card (*) for domains and ports. Thus, only if site2 trusts site1, will it allow site1 to connect.&lt;br /&gt;&lt;br /&gt;The first problem with this approach is the trust model: Flash Player asks site2 instead of the user for permission. This means the user still does not have control over which other sites the Flash application connects to; if site2 used wild-card domains then any application can connect to it; and admins of site1 and site2 must coordinate and collaborate. 
In most real deployments, this means that site1 and site2 are owned by the same entity and the deployment builds a false sense of a &lt;span style="font-style:italic;"&gt;closed walled garden&lt;/span&gt; of client-server applications.&lt;br /&gt;&lt;br /&gt;The second problem is that it is very easy to work around: if site1 wants to use or connect to site2, but site2 does not trust site1, then site1 can install a connection proxy on site1 and have the Flash application connect to site2 via this proxy. So it does not really protect site2 from access by any third-party Flash application -- i.e., there is no closed walled garden for site2. What it actually means is that if site2 has content or a service then anybody can build a Flash application to access that content or service as long as he can host a proxy on the Internet. You just need _one_ person in the whole Internet with good bandwidth and an open proxy to potentially break the crossdomain trust model of _all_ the Flash applications.&lt;br /&gt;&lt;br /&gt;The third problem is that it assumes the Flash Player binary will not be reverse engineered: This is the worst sense of security in the literature. When a Flash application is downloaded from site1 by Flash Player, and wants to make a connection to another site2, Flash Player just checks the URL from where the Flash application was downloaded. If the URL contains the domain of site2, then usually the crossdomain check is not done, but if the URL does not contain site2, then it first gets the crossdomain.xml file. If some person is able to hack and modify the URL variable inside Flash Player, or in transit over non-secure HTTP, then Flash Player will effectively ignore the crossdomain check.&lt;br /&gt;&lt;br /&gt;The fourth problem is the way it has evolved: The initial implementations of the crossdomain policy were broken, with lots of ways to work around them. With newer implementations of Flash Player some of these problems have been fixed. 
However, that means the older mechanism is deprecated and the newer mechanism no longer works with the older one. A concrete example of the problem follows. Should an application downloaded from http://site1/p/app.swf be allowed to connect to site1:5222 (non-HTTP port), or to http://site1 or http://site1/q/second.swf? If yes, then public hosting of personal web pages effectively opens up all the Flash applications of all the hosted users on that server. So why not define a meta-policy at the top level http://site1 which controls everything else? How can site2 allow only one application from site1 but not others to connect? How can site2 allow only applications that are signed by site2 to connect but not others, even if those applications are hosted on other sites? No answer!&lt;br /&gt;&lt;br /&gt;The fifth problem is that it depends on other unsecured mechanisms: HTTP and DNS. I won't describe details on these, but it would have been nicer if the security was based on code signing or standard authentication mechanisms instead of comparing the hostname from the HTTP URL. &lt;br /&gt;&lt;br /&gt;The sixth problem is its incompatibility with existing systems: When implementing a custom non-HTTP protocol, the Flash Player sends the first request as &amp;lt;policy-file-request/&amp;gt; on the connected socket. What if the service does not understand this request? Well, use a meta-policy. In that case, for it to work, the meta-policy needs to be served from, say, port 843. What if another server is present on the machine on port 843? Or what if you want to add handling of this policy-file-request in your own service protocol? In my personal experience getting this into an XMPP server was a nightmare. What if Flash Player puts a NUL character after its request? 
You cannot control the request or response for the policy file from ActionScript because it happens automatically in Flash Player before a socket is actually connected in ActionScript.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;The solution:&lt;/span&gt; In the context of open services and/or open networks and/or P2P, the crossdomain mechanism is a problem rather than a solution. If site2 has hosted an open service, it should not restrict anybody from building applications to connect to site2. Thus site2 should put up a crossdomain.xml with wild-card domains and ports. If site1 has built a Flash application for an open service, the application should be allowed to connect to any service that follows the protocol as long as the user approves the connection. Thus site1 should build the Flash application with correct user authentication, and then proxy the connection via site1's proxy to site2's service instead of having it connect directly to site2. This allows site1 and site2 to operate independently of each other as long as they implement a common service protocol, e.g., XMPP or P2P-SIP.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;The real solution:&lt;/span&gt; A correct security implementation in Flash Player should have done something like the following:&lt;br /&gt;1. When the Flash application tries to connect to another new site, ask the user (similar to the camera and microphone security settings) if he wants to allow the connection. Also give an option to remember the approval if needed.&lt;br /&gt;2. Allow site2 to sign a Flash application, and require that only Flash applications signed under a given root certificate should be allowed to connect to site2. 
&lt;br /&gt;Beyond these, any site should be free to implement its own authentication mechanism.</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/5842521503821785069/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=5842521503821785069' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/5842521503821785069'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/5842521503821785069'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/10/insecurity-of-flash-players-crossdomain.html' title='The (in)security of Flash Player&apos;s crossdomain'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-4537714800423856525</id><published>2009-09-04T23:36:00.004-04:00</published><updated>2009-11-24T03:33:27.827-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='video'/><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='multimedia'/><category scheme='http://www.blogger.com/atom/ns#' term='RTMP'/><category scheme='http://www.blogger.com/atom/ns#' term='videocity'/><category scheme='http://www.blogger.com/atom/ns#' term='idea'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Flash Player'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='softcard'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><title type='text'>The Internet Video City</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://videocity.googlecode.com/svn/trunk/data/visitingCard.png"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 120px;" src="http://videocity.googlecode.com/svn/trunk/data/visitingCard.png" border="0" alt="" /&gt;&lt;/a&gt;&lt;br /&gt;(For the last month and a half, I have been aggressively involved in another open source project, "videocity". This article describes the salient features and novel ideas in that project.)&lt;br /&gt;&lt;br /&gt;The goal of the &lt;a href="http://code.google.com/p/videocity/"&gt;Internet video city&lt;/a&gt; project is to provide open source software tools, both client and server, for video communication and sharing. Unlike other file sharing systems, this is targeted towards video and live video sharing in small groups. Unlike other video communication services, this project provides the tools needed to build a service. &lt;br /&gt;&lt;br /&gt;&lt;h3&gt;High level description&lt;/h3&gt;&lt;br /&gt;At the high level, the video communication is abstracted out as a city. An individual can sign up with his email user@domain.com and own a home with a URL of the form http://server:5080/user@domain.com. This is also the location of the default guest room of that user. The user can build other rooms inside this URL, e.g., for hosting an online family gathering, he can get a room with the name "Family Gathering" and a room URL of the form http://server:5080/user@domain.com/Family.Gathering. Each room can be made public or private. 
A public room is accessible to anyone visiting the URL of the room, whereas a private room needs explicit permission to enter.&lt;br /&gt;&lt;br /&gt;Once you have entered a room, you see other members in the room, and can communicate with others using real-time audio, video and text chat. You can share media files such as photos and videos from your computer with others in the room. You can also share online photos and videos with others. All these shared resources are put in an active session and will disappear when the room is closed, i.e., when all members have left the room.&lt;br /&gt;&lt;br /&gt;The owner of the room can decorate his room by uploading, recording or editing the room's content. A room's content is described using an XML file containing multiple play lists. Each play list contains a sequence of media files or URLs. When you enter a room, you see all the pre-configured play lists in that room. This allows the owner to, for example, create a room with his family pictures and videos in a slide show, and give out the URL to others to view the photos. A media resource in a play list can be text, image or audio/video. The image and audio/video can be uploaded from the user's computer, downloaded from a web URL or recorded using the user's camera in real-time. The play list can be readily edited using drag-and-drop, the built-in text editor or various button controls.&lt;br /&gt;&lt;br /&gt;Each signed in user also has an inbox. The inbox is a special XML file that gets loaded when a user logs in, and contains play lists that are sent by other users to this user. When you enter a room, you have an option to send a play list to the owner of the room, which turns up in the owner's inbox. You can record the play list using your camera, or create one using resources available from the web. 
The play list stored in the inbox is privately available only to the owner of the room.&lt;br /&gt;&lt;br /&gt;This simple concept of play lists and rooms allows us to implement various communication scenarios, for example, real-time communication, video mails, publicly posted videos, and video web sites.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Novel idea&lt;/h3&gt;&lt;br /&gt;One of the novel concepts used in the project is that of the soft-card. A soft-card is a digital version of your ID card or visiting card. There are two types of cards: a &lt;i&gt;Private login card&lt;/i&gt; is your confidential ID card that you use to log in to the site, and an &lt;i&gt;Internet visiting card&lt;/i&gt; is your room's visiting card, which you give out to your friends so that they can visit your room. Usually each signed in person has a private login card, and each room owned by the person can have an Internet visiting card. &lt;br /&gt;&lt;br /&gt;A soft-card looks like a digital image of your real ID and visiting cards. It is actually an image file in PNG format. The image has a photo, your name or your room's name, a list of key words identifying your room, and the URL of your room. Unlike a regular PNG file, a soft-card has additional meta information that is used in secure identification and access. In particular, your private login card has your RSA private key (refer to PKI) and your Internet visiting card has an X.509 certificate using an RSA public key signed by the server. This meta information, such as keys, certificates, names, emails, keywords, etc., is stored in information chunks of the PNG file itself. &lt;br /&gt;&lt;br /&gt;Similar to public key cryptography, these digital files can allow us to implement security, authentication, access control, privacy, confidentiality, etc. Essentially, anything you can do with PKI, you can do with these soft-cards. 
Additionally, these soft-cards give the visual appearance of an ID card or a visiting card containing the URL which they represent. Users receive them by email on signup, and can give out visiting cards to others by email. An example visiting card is shown at the top of this article. If you edit the card's file or image in any way, e.g., by converting it to JPEG and back, or by editing it with a photo editor, then the card's key information will become invalid and unusable. Note that a card is valid only within the domain it is created for. Thus a card created for http://server1/room1 cannot be used by http://server2/room1 even if both server1 and server2 virtual domains are hosted by the same server. &lt;br /&gt;&lt;br /&gt;Once we have the login (private key) and visiting (public key) cards, implementing the rest of the security mechanisms is straightforward. For example, resources in an inbox can be encrypted using the public key of the owner, so that only a private login card can decrypt them. The public rooms are signed by the owner's private key, so that anyone with the visiting card of the room can verify the signature. When sending a media resource to another user, PKI can be used to establish a secure communication session. A room can be made private by allowing only connections from people who have a valid visiting card for that room, and having the owner send out the visiting card to his friends and family using an independent channel such as email. A room can be made public by uploading the visiting card to the room itself, so that anyone with the URL can first download the visiting card (i.e., public key) and use that to connect to the room. Although we haven't implemented most of the security mechanisms, we have the basic soft-card concept implemented in the project. In particular, you can create your cards, edit the layout of the card during creation, download them after creation, and upload them in the client to join a room or to log in. 
One thing to note is that within the Flash Player environment, the amount of security using PKI is limited. But since we have our own video server implementation as well, we can do some novel tricks in that regard.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Product design ideas&lt;/h3&gt;&lt;br /&gt;&lt;br /&gt;There are several product design ideas we implemented in the project: (1) consistency, (2) flowing and smooth interface, and (3) performance. In this section, I describe these ideas and how they are implemented.&lt;br /&gt;&lt;br /&gt;Consistency is very important in user interface design. The look and feel of various buttons should be consistent. Common operations should be consistent with what people are used to doing. For example, most Windows users see the 'close', 'maximize', 'minimize' buttons on the top-right corner. Most Mac users see the bottom bar as a tools or commands bar. Most instant messaging users see notifications on the bottom-right corner of their screen. We used these concepts in our UI design as well.&lt;br /&gt;&lt;br /&gt;Flash allows us to implement a nice, smooth and flowing user interface. When you go from one room to another, the view slides your window from one room to another. The sliding window component in the project nicely abstracts out the details of this container. When a help video is played, it animates to the full view, and when it is paused, it goes back to the original position. For help videos, flowing subtitles along with audio/video give a better user experience. Computer users are comfortable with drag-and-drop operations using the mouse. In our project, the play list editing, video window re-organizing, delete button, etc., use the drag-and-drop mode of operation. &lt;br /&gt;&lt;br /&gt;Performance is important once the project grows to a significant size. In particular, Flash Player spends a lot of cycles rendering images. 
This is improved significantly in our project since we use only programmatic skins for all our buttons and icons. Moreover, programmatic skins scale nicely when going to full screen or a different size. &lt;br /&gt;&lt;br /&gt;There were a number of lessons we learned in this project from the product design perspective. Moreover, being responsible for both product design and product engineering helped us avoid ambiguity, which is usually seen in multiple team projects.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;The big picture&lt;/h3&gt;&lt;br /&gt;&lt;br /&gt;Although the project is still "work in progress" and a lot of work remains, I wanted to give a big picture of the project. Flash Player is a great browser plugin. However, being proprietary makes it hard for others to use it to its full potential. For example, until recently video communication was restricted to only Flash Media Server, and file uploads were not allowed from the local computer to Flash Player without going through the server. Although Adobe is making significant progress in keeping the developer community engaged (e.g., making the RTMP protocol open, or making file uploads and downloads available in the new Flash Player), there will always be some restriction in the Flash Player. For example, the absence of an H.264 encoder or a good audio quality/preprocessing engine prevents us from using it efficiently in true H.264 video communication or good real-time audio communication. In any case, since the RTMP protocol is open, and since there are a number of existing open source RTMP implementations, one can use back-end RTMP based servers to perform some processing.&lt;br /&gt;&lt;br /&gt;This videocity project gives us back-end tools to intercept RTMP, integrate web communication, and expose a single server to support various requirements of video conferencing. One can ask whether this will scale. The answer is: maybe not. 
The reason for doing the project, though, is that it fits nicely in the big picture of a P2P-SIP based communication framework. Flash gives a nice, ubiquitous, browser based front end, whereas our videocity server gives tools that can be integrated with a peer-to-peer network. Thus we can gain from the advantages of both worlds.&lt;br /&gt;&lt;br /&gt;Distributing a conference in a P2P network is an already researched problem. Several solutions exist, ranging from application level multicast for large conferences, to full mesh for small conferences, to picking a few servers as relay bridges. Maintaining shared distributed state of the conference and collaboration is interesting to explore. The SIP community has done significant work on a centralized conferencing framework, e.g., in the IETF XCON working group. The P2P-SIP working group is creating a protocol for standards based peer-to-peer network maintenance and lookup for SIP services. Finally, some API or interface specification is needed for videocity's client-server model so that others can build clients or server adaptors to integrate XCON, P2P-SIP and videocity. In particular, we will define all the interface elements such as the format of the soft-card, the various RPC calls for uploading or downloading resources, sharing play lists and authenticating users, as well as the communication mechanisms.&lt;br /&gt;&lt;br /&gt;In summary, the project gives developers a starting point from which to build a video communication service, video message platform, video recording and editing system, collaboration engine, media sharing software, video blog web site, video rooms, multi-party conferencing applications, desktop clients, browser extensions, application sharing, new client applications, and so on. 
The client-server tools available in the project allow you to record a video or snapshot photo from your camera and store it in a local file, create play lists of various heterogeneous media resources, and share live and stored media with others using the system. &lt;br /&gt;&lt;br /&gt;There is no hosted service for this software, and we don't plan to have one. This is because our goal is to go peer-to-peer, where various installations of the software will discover and communicate with each other!&lt;br /&gt;&lt;br /&gt;Thank you for reading, and we love feedback!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-4537714800423856525?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/4537714800423856525/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=4537714800423856525' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/4537714800423856525'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/4537714800423856525'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/09/internet-video-city.html' title='The Internet Video City'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' 
src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-4707054230163675686</id><published>2009-08-29T14:24:00.005-04:00</published><updated>2009-11-24T03:33:53.012-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><title type='text'>Beauty of open source</title><content type='html'>This article presents an analogy between software projects and beauty: open source projects have natural beauty, whereas commercial projects acquire beauty through cosmetics and makeup. [see &lt;a href="http://p2p-sip.blogspot.com/2009/05/why-open-source.html"&gt;previous article&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;&lt;table border="1"&gt;&lt;br /&gt;&lt;tr&gt;&lt;td&gt;&lt;br /&gt;Beauty&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;br /&gt;Software project&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;br /&gt;&lt;br /&gt;Natural beauty is, well, natural, whereas cosmetics give an artificial sense of beauty.&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;br /&gt;Open source software is built when a motivated developer feels like building something, whereas commercial software is mostly built by engineers who are paid and forced to build it.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;br /&gt;&lt;br /&gt;Natural beauty is long lasting, whereas makeup wears off after a few hours or days. &lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;br /&gt;Open source software lives longer, whereas commercial projects tend to get beaten by the competition sooner or later.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;br /&gt;&lt;br /&gt;You are born beautiful and don't have to pay for natural beauty, whereas cosmetics cost money, big money. 
&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;br /&gt;Open source software is mostly free, whereas you have to pay for commercial software. &lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;br /&gt;&lt;br /&gt;It is hard for a salesman to sell fruit and water to enhance your beauty, whereas it is easy for a salesman to sell cream and nail polish. Sometimes these advertisements are deceptive.&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;br /&gt;Usually you don't find people advertising their open source work much, whereas companies have dedicated sales teams to sell the commercial product. Sometimes these sales pitches are deceptive.&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;br /&gt;&lt;br /&gt;Natural beauty is usually open and does not hide scars, whereas cosmetics are meant to hide scars, marks, etc., to give a (false) sense of beauty.&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;br /&gt;Open source software exposes its source code, so developers can jump right into the code and see things. Commercial software hides the source code, and instead presents documentation, PowerPoint slides, sales pitches, etc., to give a (false) sense of what is inside while actually hiding what is inside (the source code).&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;br /&gt;&lt;br /&gt;Natural beauty does not require support. On the other hand, if you are putting on nail polish, you will need a remover; if you are putting on makeup, you will need to remove it before bed.&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;br /&gt;Open source software usually comes with no support. If you have it, you own it, and you put up with it. Commercial software usually comes with an (expensive) support system. If you have it, you need to keep paying for bug fixes and upgrades. 
Otherwise it will harm you sooner or later.&lt;br /&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-4707054230163675686?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/4707054230163675686/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=4707054230163675686' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/4707054230163675686'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/4707054230163675686'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/08/beauty-of-open-source.html' title='Beauty of open source'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-2658870660877213539</id><published>2009-07-05T15:19:00.006-04:00</published><updated>2009-11-24T03:34:20.254-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='server'/><category scheme='http://www.blogger.com/atom/ns#' term='API'/><category scheme='http://www.blogger.com/atom/ns#' term='programmable'/><title type='text'>Programmable SIP server</title><content type='html'>I had posted an article on generic &lt;a href="http://p2p-sip.blogspot.com/2009/05/apis-for-sip-applications.html"&gt;SIP API&lt;/a&gt; 
earlier. I implemented a first version in my &lt;a href="http://39peers.net"&gt;39 peers&lt;/a&gt; project over the weekend. I also used the API to implement a simple SIP proxy and registrar server. I tested the server using X-Lite clients in an intranet. The basic SIP registration and call routing with record-route seem to work well. &lt;br /&gt;&lt;br /&gt;There are precisely two modules in the implementation: sipapi to implement the core of the API and sipd to implement the server, with 252 and 110 lines, respectively, of Python. The first version only supports simple call routing as needed for a server, without any advanced features such as NAT traversal. It also supports fail-over and load sharing models using the two stage SIP server farm as described in my &lt;a href="http://kundansingh.com/papers/thesis.pdf"&gt;PhD thesis&lt;/a&gt; (although with the same set of servers performing both the first and second stage for different sets of users). &lt;br /&gt;&lt;br /&gt;In the implementation, the API exposes an Agent class that represents a listening endpoint. The agent dispatches various events such as "incoming" to signal an incoming message. The application, or SIP server in this case, attaches a local function to handle the "incoming" event and process it. The processing logic is inspired by SIP Express Router (SER)'s config file. The following code creates the listening agent on the given listening IP and port using UDP transport.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  from app import sipapi&lt;br /&gt;  agent = sipapi.Agent(sipaddr=('192.168.1.3', 5060), transports=('udp',)).start()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Once the agent is created, you can attach your function (say "route") to handle incoming messages. 
The handler function gets an event representing the incoming message, and acts on it using the methods available on the "event.action" property.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;  def route(event):&lt;br /&gt;    # reject oversized messages&lt;br /&gt;    if len(str(event)) &gt; 8192: return event.action.reject(513, 'Message Overflow')&lt;br /&gt;    # proxy INVITEs addressed to 'anyone' to a fixed location&lt;br /&gt;    if event.method == 'INVITE' and event.uri.user == 'anyone':&lt;br /&gt;      event.location = URI('sip:someone@somewhere.com')&lt;br /&gt;      return event.action.proxy()&lt;br /&gt;    ...&lt;br /&gt;  agent.attach("incoming", route)&lt;br /&gt;  sipapi.run()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The event object available to the incoming message handler has an agent property representing the original Agent object. This allows you to access state and configuration elements on the agent. The API also defines a Location class to store contact locations from an event, and retrieve contact locations for a URI. The incoming event has several action methods such as accept, reject, proxy, redirect and challenge. If the handler does not invoke any action, the API calls the default action method, which responds to the incoming message with '501 Not Implemented'.&lt;br /&gt;&lt;br /&gt;The API and server can be extended to implement additional features such as NAT traversal, a presence server, etc. For example, new event types can be defined to indicate a presence change and allow the user to take action on these events. Alternatively, additional modules can define methods to modify the SDP or the SIP request to handle NAT traversal, similar to how SER's nathelper module works. 
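&lt;br /&gt;&lt;br /&gt;To make the dispatch model described above concrete, here is a minimal, self-contained sketch. It is not the sipapi module: the Agent, Event and Action classes below are hypothetical stand-ins written only for this illustration, and they assume nothing beyond what the text states (handlers are attached by event name, a handler picks an action, and the default is '501 Not Implemented').&lt;br /&gt;

```python
# Hypothetical stand-in for the dispatch model described in this post.
# None of these classes are the real sipapi module; names are illustrative.

class Action(object):
    """Collects the response chosen by the handler for one event."""
    def __init__(self):
        self.result = None  # (code, reason) once an action has been invoked

    def accept(self):
        self.result = (200, 'OK')

    def reject(self, code, reason):
        self.result = (code, reason)

    def default(self):
        # used when the handler returns without invoking any action
        self.result = (501, 'Not Implemented')


class Event(object):
    """A parsed incoming request, reduced to the fields the handler needs."""
    def __init__(self, agent, method, user):
        self.agent, self.method, self.user = agent, method, user
        self.action = Action()


class Agent(object):
    """Keeps one list of handlers per event type and dispatches to them."""
    def __init__(self):
        self.handlers = {}

    def attach(self, name, handler):
        self.handlers.setdefault(name, []).append(handler)

    def dispatch(self, name, method, user):
        event = Event(self, method, user)
        for handler in self.handlers.get(name, []):
            handler(event)
        if event.action.result is None:
            event.action.default()
        return event.action.result


def route(event):
    # accept REGISTER, reject calls to an unknown user, ignore the rest
    if event.method == 'REGISTER':
        return event.action.accept()
    if event.method == 'INVITE' and event.user == 'nobody':
        return event.action.reject(404, 'Not Found')


agent = Agent()
agent.attach('incoming', route)
print(agent.dispatch('incoming', 'REGISTER', 'alice'))   # (200, 'OK')
print(agent.dispatch('incoming', 'INVITE', 'nobody'))    # (404, 'Not Found')
print(agent.dispatch('incoming', 'SUBSCRIBE', 'alice'))  # (501, 'Not Implemented')
```

&lt;br /&gt;&lt;br /&gt;Keeping the default response inside the dispatcher rather than in each handler means a handler only has to describe the cases it cares about, which is what keeps the route function short.&lt;br /&gt;&lt;br /&gt;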
If you would like to work on these server extensions, do &lt;a href="mailto:kundan10@gmail.com"&gt;let me know&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-2658870660877213539?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/2658870660877213539/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=2658870660877213539' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2658870660877213539'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2658870660877213539'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/07/programmable-sip-server.html' title='Programmable SIP server'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-1317552951571535386</id><published>2009-06-27T17:27:00.003-04:00</published><updated>2009-11-24T03:34:57.544-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ActionScript'/><category scheme='http://www.blogger.com/atom/ns#' term='Flash Player'/><category scheme='http://www.blogger.com/atom/ns#' term='API'/><category scheme='http://www.blogger.com/atom/ns#' term='Google Video'/><title type='text'>How does Google video chat work in gmail?</title><content type='html'>This post is just a speculation (aka guess work).&lt;br /&gt;&lt;br /&gt;Google has a 
&lt;a href="http://www.readwriteweb.com/archives/google_launches_video_and_voic.php"&gt;video chat&lt;/a&gt; function from within the Gmail web pages. This function is not available in the GTalk client yet. Google requires you to download a plugin which enables the video chat function from Gmail. The video is rendered using Flash Player. In this article I present my understanding of how it works.&lt;br /&gt;&lt;br /&gt;Flash Player exposes certain audio/video functions to the (SWF) application. But the Flash Player does not give the application access to the raw real-time audio/video data. There are some ActionScript API classes and methods: the Camera class allows you to capture video from your camera, the Microphone class allows you to capture audio from your microphone, the NetConnection/NetStream classes allow you to stream the video from Flash Player to a remote server and vice-versa, and the Video class allows you to render video either captured by Camera or received on NetStream. Given these, to display video in Flash Player the video must be either captured by a Camera object or received from a remote server on a NetStream. Luckily, ActionScript allows you to choose which Camera to use for capture.&lt;br /&gt;&lt;br /&gt;When the Google plugin is installed, it exposes itself as two Camera devices, which are actually virtual device drivers. These devices are called 'Google Camera Adaptor 0' and 'Google Camera Adaptor 1', which you can see in the Flash Player settings when you right-click on the video. One of the devices is used to display the local video and the other to display the remote participant's video. The Google plugin also implements the full networking protocol and stack, which I think are based on the GTalk protocol. In particular, it implements XMPP with the (P2P) Jingle extension, and UDP-based media transport for transporting real-time audio/video. The audio path is completely independent of the Flash Player. 
In the video path, the plugin captures video from the actual camera device installed on your PC and sends it to the Flash Player via one of the virtual camera device drivers. It also encodes and sends the video to the remote user. In the reverse direction, it receives video (over UDP) from the remote user and gives it to the Flash Player via the second virtual camera device driver. The SWF application running in the browser creates two Video objects and attaches them to two Camera objects, one for each of the two virtual video devices, instead of attaching them to your real camera device. This way, the SWF application can display both the local and remote video in the Flash application. &lt;br /&gt;&lt;br /&gt;What this means is that for multi-party video calls, the plugin will have to either (1) expose itself as more video devices (is there any limit on devices?), or (2) somehow multiplex multiple videos in the same video stream (which is CPU expensive), or (3) show only one active remote participant in the call (which gives a bad user experience).&lt;br /&gt;&lt;br /&gt;An open question: will it be possible to use Google's plugin to build our own Flash application and somehow use our own network application/protocol to implement a video call? 
Hopefully Google will make the plugin API available to public some day.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-1317552951571535386?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/1317552951571535386/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=1317552951571535386' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/1317552951571535386'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/1317552951571535386'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/06/how-does-google-video-chat-work-in.html' title='How does Google video chat work in gmail?'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-5868285512204605033</id><published>2009-06-22T01:55:00.005-04:00</published><updated>2009-11-24T03:35:47.578-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='RTMP'/><category scheme='http://www.blogger.com/atom/ns#' term='Flash Player'/><category scheme='http://www.blogger.com/atom/ns#' term='Problems'/><category scheme='http://www.blogger.com/atom/ns#' term='AMF'/><title type='text'>Problems in RTMP</title><content type='html'>Adobe's RTMP or Real-Time Messaging Protocol was recently made available to public as an &lt;a href="http://www.adobe.com/devnet/rtmp/"&gt;open 
specification&lt;/a&gt; as part of Adobe's Open Screen initiative. Most of the protocol had already been implemented in third-party software such as &lt;a href="http://osflash.org/red5"&gt;Red5&lt;/a&gt;, &lt;a href="http://rtmpy.org/"&gt;rtmpy&lt;/a&gt; and &lt;a href="http://code.google.com/p/rtmplite/"&gt;rtmplite&lt;/a&gt; long before this specification became public. In this article I take a critical look at the protocol.&lt;br /&gt;&lt;br /&gt;There are three parts in the specification: (1) RTMP chunk stream, (2) RTMP message format and (3) RTMP command messages. At the high level, there are different types of messages such as command, data, audio and video. The last specification describes the high-level RPC (remote procedure call) for various commands and their responses, such as creating a network stream or publishing a stream. The actual formatting and parsing of individual types in a command are specified using AMF (Action Message Format), which comes in two flavors: AMF0 and AMF3. The messages that control the protocol, such as setting the window size of the lower layer or the bandwidth for the peer, are specified in the second specification. Finally, the first specification defines the low level chunk format and separates the high level message stream from the low level transport (chunk) stream.&lt;br /&gt;&lt;br /&gt;The first (and worst) problem with RTMP is that it is overly complex in doing what it does. One reason is that it was poorly designed without extensibility or competing peer protocols in mind, and was later "fixed" to add new features. As an example of complexity: the chunk stream ID field in the first specification was initially intended to be up to 63 but was later extended to 65599. For IDs 2 to 63, the first byte stores the value in its least significant 6 bits (the two most significant bits hold the chunk header type). For IDs in the range 64-319, the second byte stores the value minus 64, whereas the 6-bit field of the first byte stores 0. 
For values between 64 and 65599, the second and third bytes store the value minus 64 in little-endian order (ID = 64 + second byte + 256 * third byte), whereas the 6-bit field of the first byte stores 1. Another example is the timestamp field, which is 24 bits. However, the protocol supports a 32-bit timestamp: if the value does not fit in 24 bits, then the 24-bit field is set to all 1's, and the actual (extended) timestamp is stored after the header. What is surprising is that a binary protocol called RTP (Real-time Transport Protocol) existed before RTMP was conceived, and had a well defined and well thought-out message layout. For example, RTP has a version field for extensibility, and a 32-bit timestamp. Unfortunately, RTMP didn't learn from the peer protocol and suffered in the form of excessive complexity.&lt;br /&gt;&lt;br /&gt;RTMP is designed to work only on TCP, and cannot work on UDP without several modifications. One well understood conclusion of early Internet multimedia research was that UDP is better suited than TCP for real-time media transport. While RTMP calls itself real-time, it was designed to work solely on TCP. There is no sequence number to handle lost packets, hence it relies on the lower layer (TCP) to provide guaranteed packet delivery. Note that the timestamp cannot be used to detect lost packets. The header optimization does not work if packets are delivered out of order. The new RTMFP does work over UDP but has its own &lt;a href="http://p2p-sip.blogspot.com/2009/02/rtmfp-vs-sip.html"&gt;set of problems&lt;/a&gt; and is not yet an open specification.&lt;br /&gt;&lt;br /&gt;RTMP has several unnecessary elements. The chunk stream mechanism is not necessary and actually hurts the performance of real-time media transport, besides complicating the implementation. In particular, for client-server communication, where the number of connections/streams between one client-server pair is typically one, there is no real advantage in using chunks. 
Chunking can have an advantage in server-to-server communication by avoiding head-of-line blocking of one stream by another. Secondly, the bulky initial handshake of RTMP, which, I believe, was intended to measure bandwidth or end-to-end latency, is actually not useful.&lt;br /&gt;&lt;br /&gt;Media and control paths should be separate. IETF protocols such as RTSP or SIP, as well as the ITU-T protocol H.323, exhibit this separation by delegating the media transport to a separate RTP stream. This has several advantages because the control path usually travels through application servers that are CPU and memory intensive, and have different scaling requirements than media servers, which are bandwidth and disk intensive. Separating the media path from the control path achieves scalability, robustness and a distributed component architecture in the system. On the other hand, in RTMP control goes hand-in-hand with media. For example, the application server that handles shared objects and conference state also handles media storage and transport. &lt;br /&gt;&lt;br /&gt;RTMP has inconsistencies. The first example is the use of some data types. The stream ID field appears at several places in the protocol, in different forms: 32-bit little endian, 32-bit big endian, and 64-bit floating point number. The second example is incoherency between layers. The default chunk size is 128 bytes. By default, real-time audio captured from the microphone is streamed to the server using Nellymoser encoded audio packets with two frames per packet. Each Nellymoser encoded frame is 64 bytes. Besides, there is a one byte header indicating the codec type. Thus each packet in the default case is 129 bytes. Consequently, under default operation, a Flash Player should immediately change the chunk size from 128 to 129 to accommodate a full audio packet in a chunk (so as to avoid fragmenting it, which would be inefficient). 
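&lt;br /&gt;&lt;br /&gt;A minimal sketch of that arithmetic, using only the sizes quoted above. The function name chunks_needed is made up for this illustration and is not part of any RTMP library:&lt;br /&gt;

```python
# Sketch of the arithmetic above; the constants come from the text,
# the function name is made up for illustration.

FRAME_BYTES = 64        # one Nellymoser encoded frame
FRAMES_PER_PACKET = 2   # default frames per audio packet
CODEC_HEADER_BYTES = 1  # one-byte codec type header

def chunks_needed(payload_bytes, chunk_size):
    """Number of chunks a payload is split into (ceiling division)."""
    full, rest = divmod(payload_bytes, chunk_size)
    return full + (1 if rest else 0)

packet = FRAME_BYTES * FRAMES_PER_PACKET + CODEC_HEADER_BYTES
print(packet)                      # 129
print(chunks_needed(packet, 128))  # 2: a 128-byte chunk plus a 1-byte chunk
print(chunks_needed(packet, 129))  # 1: the whole packet fits in one chunk
```

&lt;br /&gt;&lt;br /&gt;Every additional chunk also carries its own chunk header, so the second chunk costs more on the wire than the single payload byte it delivers.&lt;br /&gt;&lt;br /&gt;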
Going off by 1 byte indicates that something went wrong while designing the protocol for the default case.&lt;br /&gt;&lt;br /&gt;When the rest of the world was moving towards open standards such as RTP, Adobe embraced the closed and proprietary RTMP. Adobe has been a proponent of proprietary technologies, imposing sub-optimal technologies on developers and users. Another example is the RTMPE extension for encrypted RTMP communication. Readers are encouraged to read &lt;a href="http://ossguy.com/?p=398"&gt;this article&lt;/a&gt;: "The major implication of this takedown notice is that Adobe has definitively told us that a fully-compliant free software Flash player is illegal. This is because RTMPE is part of Flash, circumventing RTMPE is illegal (in the US at least), and Adobe will never give a key to a free software project since they cannot hide the key. As a result, Flash cannot truly be a standard..."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-5868285512204605033?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/5868285512204605033/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=5868285512204605033' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/5868285512204605033'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/5868285512204605033'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/06/problems-in-rtmp.html' title='Problems in RTMP'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' 
height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-3257066197702114894</id><published>2009-05-23T06:12:00.006-04:00</published><updated>2009-11-24T03:36:15.705-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='API'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='event based'/><title type='text'>APIs for SIP applications</title><content type='html'>There are three types of SIP APIs: (1) source code API such as ones defined by &lt;a href="https://jain-sip.dev.java.net/"&gt;JAIN SIP&lt;/a&gt;, &lt;a href="http://www1.cs.columbia.edu/~kns10/software/libsipapi/"&gt;libsip++&lt;/a&gt; and &lt;a href="http://www.pjsip.org/pjsip/docs/html/group__PJSUA__LIB.htm"&gt;pjsip&lt;/a&gt;,  (2) high-level API to control the per-call behavior such as &lt;a href="http://www.ietf.org/rfc/rfc2824.txt"&gt;CPL&lt;/a&gt;, &lt;a href="http://www.ietf.org/rfc/rfc3050.txt"&gt;SIP-CGI&lt;/a&gt;, &lt;a href="https://sip-servlets.dev.java.net/"&gt;SIP Servlet&lt;/a&gt;, &lt;a href="http://www1.cs.columbia.edu/~xiaotaow/rer/Research/Paper/draft-wu-iptel-less-00.txt"&gt;LESS&lt;/a&gt;, or (3) pseudo-code style API to control the behavior of the server or client such as &lt;a href="http://www.iptel.org/ser/doc/gettingstarted"&gt;SER&lt;/a&gt; or &lt;a href="http://sipp.sourceforge.net/"&gt;sipp&lt;/a&gt; config files. They all serve different purposes. The source code API is needed to create new software applications using existing libraries, the high-level API creates easy to define services, and the config file allows creating server or client behavior/scenarios. 
&lt;br /&gt;&lt;br /&gt;With Python as the programming language, it is possible to create a single piece of software that exposes these different types of APIs. This is because Python source code looks like human understandable pseudo-code if written with care. It makes sense to expose the APIs in my &lt;a href="http://39peers.net"&gt;P2P-SIP 39 peers project&lt;/a&gt; to support such behavior. In this post, I present some of my initial thoughts on how to implement the generic API.&lt;br /&gt;&lt;br /&gt;Firstly, the existing SIP module (&lt;a href="http://39peers.net/index.php?option=com_content&amp;view=article&amp;id=53&amp;Itemid=61"&gt;rfc3261.py&lt;/a&gt;) already has an easy to use source code API. This is further enhanced in additional modules such as voip.py for user-agent specific functions. In particular, the source code API exposes object-oriented classes such as Transaction, Dialog, Stack, etc., to perform the functions defined in the various layers of RFC 3261.&lt;br /&gt;&lt;br /&gt;Secondly, a high-level API such as CPL or SIP-CGI can be implemented using per-user scripts in Python itself (instead of XML for CPL, for example). The SIP server or client can import the per-user script based on the request-URI of the request and handle the processing. 
An example script to redirect to voice mail is shown below:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;def redirect_to_voicemail(event):&lt;br /&gt;    # send the caller to the voice mail address&lt;br /&gt;    event.location = URI('sip:jones@voicemail.example.com')&lt;br /&gt;    event.action.redirect()&lt;br /&gt;&lt;br /&gt;def proxy_then_voicemail(event):&lt;br /&gt;    # callers from example.com ring the phone first; everyone else goes to voice mail&lt;br /&gt;    if event['From'].uri.host.endswith('example.com'):&lt;br /&gt;        event.location = URI('sip:jones@example.com')&lt;br /&gt;        e = event.action.proxy(timeout=10)&lt;br /&gt;        if e in ('busy', 'noanswer', 'failure'): redirect_to_voicemail(event)&lt;br /&gt;    else: redirect_to_voicemail(event)&lt;br /&gt;    &lt;br /&gt;basic.addEventListener('incoming', proxy_then_voicemail)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Because arbitrary Python code is not safe to run the way a CPL script is, we lose a number of the guarantees provided by CPL, but nevertheless the programmable API is easy to read and write, and integrates nicely with the existing source code. Similar extensions can be built for user agent applications, along the lines of the LESS programming API.&lt;br /&gt;&lt;br /&gt;Thirdly, instead of having the SIP client or server import the per-user script, the client or server itself can be written in the script. 
The following example shows how to translate some parts of SER config script to create a new server scenario.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;from sipapi import *&lt;br /&gt;import re&lt;br /&gt;&lt;br /&gt;_debug  = True  # enable debug trace&lt;br /&gt;open(('67.93.12.18', 5060)) # listen on this ip:port for incoming packets&lt;br /&gt;&lt;br /&gt;def route(event):&lt;br /&gt;    # sanity check section&lt;br /&gt;    if event['Max-Forwards'] and int(event['Max-Forwards'].value) &lt;= 0:&lt;br /&gt;        return event.action.reject(code=483, reason='Too many hops')&lt;br /&gt;    if len(str(event)) &gt; 8192:&lt;br /&gt;        return event.action.reject(code=513, reason='Message overflow')&lt;br /&gt;    # this is used by sipsak to monitor the health of server&lt;br /&gt;    if event.method == 'OPTIONS':&lt;br /&gt;        if event['From'].uri.user == 'sipsak' and not event.uri.user:&lt;br /&gt;            return event.action.accept()&lt;br /&gt;    ...&lt;br /&gt;basic.addEventListener('incoming', route)&lt;br /&gt;&lt;br /&gt;run()  # the loop to process the SIP listening point&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Similar scripts can be written to create client scenarios similar to how sipp creates scenarios from configuration files.&lt;br /&gt;&lt;br /&gt;I feel a generic SIP API will make the job of Python programmers easier, instead of having to learn new APIs and implement them in custom SIP servers and clients. 
In any case, the article just poses some ideas, and I will be happy to mentor any student who would like to work on this project!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-3257066197702114894?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/3257066197702114894/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=3257066197702114894' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3257066197702114894'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3257066197702114894'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/05/apis-for-sip-applications.html' title='APIs for SIP applications'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-2311512346028550720</id><published>2009-05-17T03:07:00.006-04:00</published><updated>2009-11-24T03:36:44.245-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='P2P'/><category scheme='http://www.blogger.com/atom/ns#' term='Open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><title type='text'>Why Open Source?</title><content type='html'>In this article I present my view on Open Source Software. 
There are a number of articles elsewhere that talk about advantages of Open Source and why it works [&lt;a href="http://en.wikipedia.org/wiki/Open_source"&gt;1&lt;/a&gt;][&lt;a href="http://www.dwheeler.com/oss_fs_why.html"&gt;2&lt;/a&gt;][&lt;a href="http://blogs.sun.com/BVass/entry/the_no_1_reason_to"&gt;3&lt;/a&gt;]. (I use the term Open Source to mean Free Software, instead of diving into OSS vs FS debate.)&lt;br /&gt;&lt;br /&gt;Point 1: Most software organizations are driven more by business objectives, and less by technology.&lt;br /&gt;&lt;br /&gt;If you look around you notice several types of software companies: most of them are commercial ones like Microsoft, there are some like Sun that are commercial but pretend to do open source work, very few like Red Hat that build business out of open source software, and finally, there are some true open source software like Linux and Apache. I think most of Sun's Open Source attempts are for business reasons to provide competition to other existing businesses.&lt;br /&gt;&lt;br /&gt;Point 2: Software programming is an art.&lt;br /&gt;&lt;br /&gt;It is like learning a new language or doing sculpture or painting. Initially a programmer is too involved with the syntax and semantics of the language constructs and application. Once he is proficient, the programming comes naturally. For example, a good Java developer will think in terms of high level modules and classes, but once he starts implementing them things move smoothly instead of worrying about where to put the open bracket or when to create a new method. After a while, the software development process becomes an art of making beautiful, modular and efficient software. Some people have a built-in talent for the particular art, but most people learn by practice, practice and practice.&lt;br /&gt;&lt;br /&gt;Point 3: Good art requires personal motivation.&lt;br /&gt;&lt;br /&gt;Doing a scientific experiment is different than painting a new picture. 
Broadly speaking, given the same input data and experimental setup, one is likely to achieve the same result in an experiment. On the other hand, an art piece requires inherent motivation of the artist and his personal inspiration to do something new, something different. Sometimes the inspiration gets driven by commercial interest, in which case an artist may end up producing art work without much motivation -- e.g., in the commercial Indian film industry, a music director has to produce tens or hundreds of scores per year, affecting the quality of the art as well as causing plagiarism from western music. On the other hand, motivated musicians who record one album a year do generate quality and innovative work. &lt;br /&gt;&lt;br /&gt;Considering the above three points, a software engineer who is paid to work on a particular technology or piece of software is less likely to be personally motivated to create that piece of software. On the other hand, an open source developer, who is initially not paid for his work, starts the open source work because of his personal motivation to create that piece of software. Thus, open source software is more likely to be of better quality than commercial software with the same amount of testing. Hence commercial software requires quality assurance to make it competitive with open source. Although quality assurance can reduce software bugs, it doesn't make the 'art' as good as the open-source version. If you compare the design of the Apache web server or SIP Express Router with their commercial counterparts, you can understand what I mean by 'art' in this context. &lt;br /&gt;&lt;br /&gt;When I look at open source software, I assume it reflects the ideas and vision of the innovative and inspired developers who were motivated to write that piece of software. 
When I look at commercial software, I assume it is created by software engineers who got paid to write code, to write documents, to test code, and more than that it is sold by salesmen who are paid to sell those pieces of software. I don't see much personal motivation or inspiration in the loop, and I don't expect the design or the implementation of the software to be a good piece of art. (In small companies where all the people share the same vision or inspiration about the software, it is possible to create quality work. However, things change over a period of time as the company grows and new software engineers are paid to perform work on other people's innovations.)&lt;br /&gt;&lt;br /&gt;In the context of P2P-SIP, we haven't seen much commercial interest beyond the few initial contenders. The main reason is that P2P-SIP is inherently peer-to-peer, and runs against the service-based business model that the modern web/phone industry is so accustomed to. In other words, the industry has not figured out a way to make money out of open P2P-SIP. On the other hand, there are a number of developers who created open source prototype applications out of personal inspiration. 
The next steps for the open source P2P-SIP developer community: to build a bigger community, advertise and publish their work, prove that it works better than existing solutions, and use it on a daily basis!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-2311512346028550720?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/2311512346028550720/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=2311512346028550720' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2311512346028550720'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2311512346028550720'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/05/why-open-source.html' title='Why Open Source?'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-2351790438702958340</id><published>2009-05-05T00:39:00.003-04:00</published><updated>2009-11-24T03:37:15.295-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Specification'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='Document'/><category scheme='http://www.blogger.com/atom/ns#' term='RFC'/><title type='text'>Documenting RFC implementations</title><content type='html'>Just wanted to document how I am 
using the existing documentation such as IETF RFCs and Internet-Drafts in my &lt;a href="http://39peers.net"&gt;39 Peers&lt;/a&gt; project. You can look at some samples &lt;a href="http://39peers.net/download/python/doc/html/rfc2617.py.html"&gt;here&lt;/a&gt; or &lt;a href="http://39peers.net/download/python/doc/html/"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I wrote a &lt;a href="http://39peers.net/download/python/src/tools/htmlify.py"&gt;htmlify.py&lt;/a&gt; script. The script uses the &lt;a href="http://silvercity.sourceforge.net/"&gt;SilverCity&lt;/a&gt; Python package to decorate the base Python code. To include the documentation from RFCs, the script scans the Python source code to identify lines such as&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;# @implements RFC2617 (HTTP auth)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;If found, it downloads the particular RFC from the IETF website, removes empty formatting lines such as those near page breaks, and numbers all the pages and lines. It then stores the resulting file as a text document which you can look up and use in your documentation.&lt;br /&gt;&lt;br /&gt;Furthermore, the script identifies lines such as&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;# @implements RFC2617 P3L16-P3L25&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;If found, the line is replaced by an HTML DIV block containing the documentation from the RFC2617 text file, from page 3 line 16 to page 3 line 25. 
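&lt;br /&gt;&lt;br /&gt;A minimal sketch of how such annotation lines could be recognized and resolved is shown below. It is not the actual htmlify.py code: the regular expression, the function names parse_annotation and extract, and the pages dictionary (page number mapped to that page's list of lines) are assumptions made purely for illustration.&lt;br /&gt;&lt;pre&gt;
```python
import re

# Hypothetical pattern for comment lines such as
#   # @implements RFC2617 (HTTP auth)
#   # @implements RFC2617 P3L16-P3L25
ANNOTATION = re.compile(
    r"^\s*#\s*@implements\s+RFC(\d+)"
    r"(?:\s+P(\d+)L(\d+)-P(\d+)L(\d+))?"
)

def parse_annotation(line):
    """Return (rfc, start, end), or None if the line is not an annotation.

    start and end are (page, line) tuples, or None when no range is given.
    """
    m = ANNOTATION.match(line)
    if m is None:
        return None
    rfc = int(m.group(1))
    if m.group(2) is None:
        return (rfc, None, None)
    p1, l1, p2, l2 = (int(g) for g in m.group(2, 3, 4, 5))
    return (rfc, (p1, l1), (p2, l2))

def extract(pages, start, end):
    """Collect the numbered lines from start to end, both inclusive.

    pages is an assumed structure: page number mapped to a list of lines,
    where the first list item is line 1 of that page.
    """
    (p1, l1), (p2, l2) = start, end
    out = []
    for page in range(p1, p2 + 1):
        lines = pages.get(page, [])
        first = l1 if page == p1 else 1
        last = l2 if page == p2 else len(lines)
        out.extend(lines[first - 1:last])
    return out
```
&lt;/pre&gt;With this sketch, parse_annotation('# @implements RFC2617 P3L16-P3L25') yields the RFC number together with the two (page, line) positions, and extract() then returns the ten numbered lines that would go into the DIV block.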
&lt;br /&gt;&lt;br /&gt;I find this technique pretty handy: it keeps my decorated source code together with inline documentation extracted from RFCs and drafts, so I do not have to write extensive documentation myself.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-2351790438702958340?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/2351790438702958340/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=2351790438702958340' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2351790438702958340'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2351790438702958340'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/05/documenting-rfc-implementations.html' title='Documenting RFC implementations'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-3096580271230427367</id><published>2009-02-27T01:57:00.010-05:00</published><updated>2009-11-24T03:37:47.347-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='NAT'/><category scheme='http://www.blogger.com/atom/ns#' term='Firewall'/><category scheme='http://www.blogger.com/atom/ns#' term='ICE'/><category scheme='http://www.blogger.com/atom/ns#' term='Problems'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Skype'/><category scheme='http://www.blogger.com/atom/ns#' term='RTMFP'/><title type='text'>Problems due to NATs and firewalls</title><content type='html'>Network Address Translators (NAT) and firewalls create problems for end-to-end connectivity on the Internet. This not only affects P2P-SIP but also client-server SIP. In this article I post some example numbers to illustrate the point.&lt;br /&gt;&lt;br /&gt;These numbers are for example only: suppose there are 10% public Internet nodes, 30% nodes behind good (cone or address restricted) NAT, 30% nodes behind bad (symmetric) NAT and 30% nodes behind UDP blocking firewalls (F). Let's denote these as P=10%, G=30%, B=30%, F=30%. Here the public Internet nodes are typically from universities and research institutes, those behind good NAT are usually from residential DSL/Cable access, those behind bad NAT are partly from residential and partly from enterprise environments, and those behind UDP blocking firewalls are from enterprise and corporate networks. For the purpose of this probability analysis, suppose that call events between pairs of nodes are independent of each other and that every node is equally likely to call any other node. Thus, the percentage of calls between two public Internet nodes is P^2 = (10%)^2 = 0.01 = 1%.&lt;br /&gt;&lt;br /&gt;Now let us enumerate the NAT and firewall traversal techniques available to SIP. STUN helps with good NAT, whereas a TURN relay is needed for bad NAT. ICE is used to negotiate the connectivity using STUN or TURN bindings. A TCP-based relay (or even an HTTP relay) is needed for UDP blocking and very restricted firewalls. (What about TCP hole punching and other techniques?) A STUN server is light in terms of bandwidth utilization, whereas a TURN relay needs high network bandwidth and hence costs the service provider more money. 
The same is true of a TCP-based relay.&lt;br /&gt;&lt;br /&gt;If at least one participant in a call is behind a UDP blocking firewall (F), then the call must use a TCP relay. This amounts to 1-(1-F)^2 = 51% of the calls going through a TCP relay. &lt;br /&gt;&lt;br /&gt;If both participants in a call are behind bad NAT, then we need a TURN relay. This amounts to B^2 = 9% of the calls.&lt;br /&gt;&lt;br /&gt;If one participant in a call is either on the public Internet or behind good NAT and the other is on the public Internet, behind good NAT or behind bad NAT, then the media can go end-to-end using STUN bindings. This amounts to (P+G+B)^2 - B^2 = 49% - 9% = 40% of the calls.&lt;br /&gt;&lt;br /&gt;In conclusion, the VoIP provider will need to host UDP or TCP relays for 51+9=60% of the calls. This is not a good proposition.&lt;br /&gt;&lt;br /&gt;In the real world, call events are not independent of each other: the probability of a corporate user calling another corporate user within the same corporation is high. The probability of a home user calling another home user is also high. For example, a SIP service targeted towards consumers can expect to have most of the calls among residential users. Thus, the percentage of calls that can be end-to-end is much higher than 40%. Similarly, an enterprise VoIP system can expect to have mostly internal intra-enterprise calls, which do not need to cross the enterprise firewall. Hence the percentage of calls needing the relay is not as high as 60%. Let us analyze these two use cases separately.&lt;br /&gt;&lt;br /&gt;Suppose, for a consumer SIP service, the distribution of nodes is P=15%, G=50%, B=30%, F=5%, i.e., fewer users are behind bad NAT or UDP blocking firewalls. In this scenario about 20% of the calls need a relay whereas 80% don't. &lt;br /&gt;&lt;br /&gt;In an enterprise VoIP system, suppose 60% of the calls are intra-office and 40% are with parties outside the office network; then only those 40% of calls need a relay whereas 60% don't. 
In a properly engineered enterprise VoIP system, appropriate UDP ports are opened and appropriate media relays are installed in the DMZ, which facilitates a smooth media path for inter-office communication.&lt;br /&gt;&lt;br /&gt;While we can play with these numbers as much as we want, the fact remains that a significant percentage of calls need a media relay, either a UDP TURN relay or a TCP relay. This puts an unnecessary burden on the VoIP service provider, who must either install and manage relays and buy network bandwidth for them, or simply disallow calls that require a relay (and possibly lose customers).&lt;br /&gt;&lt;br /&gt;In a peer-to-peer system with super nodes such as Skype, these super nodes can act as media relays and hence save a lot of bandwidth and maintenance cost for the provider. There are some things to consider though: a node on the public Internet can become a UDP as well as a TCP relay for any call, whereas a node behind good NAT can become only a UDP relay with some workaround, but not a TCP relay. This puts too much burden on nodes on the public Internet.&lt;br /&gt;&lt;br /&gt;Let us consider the original example with P=10%, G=30%, B=30%, F=30%. In this case the 51% of calls that require a TCP relay must use one of the 10% P nodes. When acting as a relay, the bandwidth requirement at the relay is twice that of when the node is in a call. Suppose each node makes N calls a day, and generally speaking needs bandwidth for N calls. However, a public Internet node not only needs bandwidth for its own N calls, but also for relaying about 5xN calls of other users (51% of all calls spread over 10% of the nodes), which at twice the bandwidth per relayed call amounts to a total bandwidth of about 11xN calls. Thus, while the super-node architecture is beneficial to the provider, it heavily punishes users on the public Internet. 
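&lt;br /&gt;&lt;br /&gt;The arithmetic in this article can be reproduced with a few lines of Python. This is only a sketch under the same simplifying assumptions as the text (independent calls, every node equally likely to call any other node, each call counted once); the function names call_mix and public_node_load are made up for illustration.&lt;br /&gt;&lt;pre&gt;
```python
def call_mix(P, G, B, F):
    """Split calls into (tcp_relay, turn_relay, direct) fractions.

    P, G, B, F are the node fractions from the text and should sum to 1.
    """
    tcp_relay = 1 - (1 - F) ** 2         # at least one side behind a UDP blocking firewall
    turn_relay = B ** 2                  # both sides behind bad (symmetric) NAT
    direct = 1 - tcp_relay - turn_relay  # the rest can go end-to-end using STUN bindings
    return tcp_relay, turn_relay, direct


def public_node_load(P, tcp_relay, relay_cost=2):
    """Bandwidth needed at one public node, in units of its own N daily calls.

    Assumes all TCP-relayed calls are spread evenly over the public nodes and
    that relaying a call costs relay_cost times the bandwidth of being in one.
    """
    relayed_per_own_call = tcp_relay / P
    return 1 + relay_cost * relayed_per_own_call


tcp, turn, direct = call_mix(P=0.10, G=0.30, B=0.30, F=0.30)
print(round(tcp, 4), round(turn, 4), round(direct, 4))
print(round(public_node_load(0.10, tcp), 1))
```
&lt;/pre&gt;For P=10%, G=30%, B=30%, F=30% this gives about 51%, 9% and 40%, and a load of roughly 11 call-equivalents per public node, in line with the estimates above. For the consumer distribution P=15%, G=50%, B=30%, F=5% the two relay fractions add up to just under 19%, i.e., the "about 20%" quoted earlier.&lt;br /&gt;&lt;br /&gt;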
(My guess is that number of public nodes using VoIP are about 4-5%, which further burdens the public nodes).&lt;br /&gt;&lt;br /&gt;A managed P2P-SIP infrastructure can be a good alternative, where corporations and universities donate hosts/bandwidth on high speed network to act as relays/super-nodes. Alternatively, one can have an incentive system to promote hosts to become relays and super-nodes.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-3096580271230427367?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/3096580271230427367/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=3096580271230427367' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3096580271230427367'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3096580271230427367'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/02/problems-due-to-nats-and-firewalls.html' title='Problems due to NATs and firewalls'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-8103913685556363602</id><published>2009-02-24T22:16:00.010-05:00</published><updated>2009-11-24T03:38:13.294-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='Comparison'/><category 
scheme='http://www.blogger.com/atom/ns#' term='RTMP'/><category scheme='http://www.blogger.com/atom/ns#' term='RTP'/><category scheme='http://www.blogger.com/atom/ns#' term='Problems'/><category scheme='http://www.blogger.com/atom/ns#' term='RTMFP'/><title type='text'>RTMFP vs SIP</title><content type='html'>Adobe's RTMFP is &lt;span style="font-weight:bold;"&gt;not&lt;/span&gt; P2P-VoIP as exemplified by Skype. Rather, RTMFP is closer to client-server SIP or H.323, where signaling happens via a server and the media path can be end-to-end between the endpoints. When people refer to RTMFP as P2P, they mean something more like 'end-to-end media', similar to client-server SIP. &lt;br /&gt;&lt;br /&gt;Why is RTMFP important? The previous Adobe protocol, RTMP, is strictly client-server even for the media path. This gives poor quality for real-time media communication because media packets go from client to server, over TCP at that, and then are redistributed to the other client, again over TCP. End-to-end media based VoIP systems existed before Adobe implemented RTMP. I suppose the difficulty of NAT and firewall traversal and the lack of an interactive video communication requirement in Flash Player resulted in RTMP. Adobe corrected this mistake in the new protocol, RTMFP, which allows NAT and firewall traversal (to some extent) and allows an end-to-end media path without going through the server. However, the signaling still goes via the central server.&lt;br /&gt;&lt;br /&gt;Once we understand this difference between P2P-VoIP and RTMFP, let's enumerate the differences between an RTMFP-based and a client-server SIP-based communication system.&lt;br /&gt;&lt;br /&gt;1. RTMFP is a closed protocol, although Adobe recently opened up the previous RTMP. On the other hand, SIP is an open standard from IETF. This means anyone can implement SIP whereas only Adobe can implement RTMFP. 
It also means that bugs in the RTMFP protocol or its implementation are outside the scope of public review, for example by security experts. &lt;br /&gt;&lt;br /&gt;2. RTMFP is an integrated protocol that has support for signaling, encryption, media flow (flow control and congestion control) and NAT traversal. SIP, in contrast, is just one piece of the puzzle, used in conjunction with RTP/RTCP, SDP, STUN, TURN, ICE, SRTP, etc. to build a complete system. In that regard there is more scope for interoperability problems in SIP systems. The SIP interoperability test (SIPit) events have helped in solving interoperability problems among current products for over a decade. (See the next point on why RTMFP alone may not be sufficient.)&lt;br /&gt;&lt;br /&gt;3. Based on the available documentation, RTMFP works on UDP, whereas SIP can work on UDP as well as TCP. In an RTMFP application, the client should fall back to TCP-based RTMP if for some reason UDP is blocked for the client-server communication. This also means that the client will lose some of the benefits available in RTMFP, such as encryption. There are other protocols, RTMPS and RTMPE, to facilitate security and encryption over TCP-based RTMP.&lt;br /&gt;&lt;br /&gt;4. Although RTMFP works on UDP, it implements additional flow control and TCP-friendly congestion control. This helps media traffic deal with network congestion and slow receivers. On the other hand, most existing SIP systems do not implement such mechanisms in the media path. While this looks like an advantage of RTMFP, it turns out to be a problem because of the way it is implemented. In particular, the network components are disconnected from the media source components such as camera and microphone. The rate control mechanisms are implemented in network components which internally slow down the media traffic by delaying or dropping the UDP media packets. On the other hand the encode quality settings on camera and microphone components are unaffected. 
This results in packet drops due to congestion and hence choppy video or audio drop-outs. A good application built on top of RTMFP is supposed to get feedback from network components and adjust the encode quality parameters (framerate, bitrate, quality) in the camera and microphone components so that the packet drops are reduced. Thus, unless the application is smart enough to deal with this, the disconnected implementation of rate control and media source causes quality problems in RTMFP.&lt;br /&gt;&lt;br /&gt;5. Both RTMFP and SIP can use media relays to workaround NATs and firewalls. However, RTMFP does not use a super-node architecture where some clients (Flash Player instances)  act as relays, whereas (P2P) SIP can use existing client nodes to act as media relays. This means that when using RTMFP, the service provider must bear all the bandwidth cost of the relays, whereas in (P2P) SIP the cost can be distributed among the users because of the peer-to-peer nature. I analyze the cost due to NAT and firewall traversal in my next post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-8103913685556363602?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/8103913685556363602/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=8103913685556363602' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/8103913685556363602'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/8103913685556363602'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/02/rtmfp-vs-sip.html' title='RTMFP vs SIP'/><author><name>Kundan 
Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-7751283761643253156</id><published>2009-02-22T21:48:00.008-05:00</published><updated>2009-11-24T03:38:42.287-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='video'/><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='Conferencing'/><category scheme='http://www.blogger.com/atom/ns#' term='Problems'/><category scheme='http://www.blogger.com/atom/ns#' term='scalability'/><title type='text'>Why does client-server video conference fail?</title><content type='html'>I analyze some problems in client-server communication for multi-party video conferencing.&lt;br /&gt;&lt;br /&gt;Audio communication differs from video in two important ways: (1) usually in a conference only one person is speaking at any time whereas everyone's video is on, (2) audio codecs are usually fixed bit-rate whereas video codecs adjust bit-rate based on various parameters such as available network bandwidth and desired frame-rate.&lt;br /&gt;&lt;br /&gt;Problem 1:&lt;br /&gt;In a client server mode, because video coming from one participant needs to be distributed to all the other participants, the bandwidth and processing requirement at the server can be higher; unlike audio where usually only one person is speaking. Secondly, the downstream video bandwidth requirement at the client increases with the number of participants in a conference. In an N-party conference, each client will have usually one outbound audio stream, one inbound audio stream, one outbound video stream and N-1 inbound video streams. 
Note that this problem is worse for a peer-to-peer (P2P) video conference, where everyone sends a video stream to everyone else, in which case there are N-1 inbound and N-1 outbound video streams at each client. For asymmetric network access (ADSL or Cable), where upstream bandwidth is lower than downstream, this causes early saturation of the outbound network bandwidth. Shutting down the video stream or reducing the video quality while a person is not speaking saves some bandwidth, especially for speaker mode conferences.&lt;br /&gt;&lt;br /&gt;Problem 2:&lt;br /&gt;The second point of difference is that audio is usually encoded using a fixed bit-rate codec whereas video bit-rate is adjusted based on several parameters such as available network bandwidth, desired quality and frame-rate. In a client-server environment most implementations use the client-to-server network quality information to decide what bit-rate to use for the client's video encoding. Consider a two party client-server conference, where the first client is closer to the server and hence has lower latency. The first client decides to use high quality, high bitrate video encoding. On the other hand, the second client, with a poorer path to the server, decides to use low quality, low bitrate video encoding. This asymmetry causes the first client to receive poor quality video whereas the second client's downstream link gets congested with high bitrate video. The problem is further aggravated if in a multi-party conference there is only one participant on a poor quality network. The problem is caused because we use the client-server network latency metric instead of the end-to-end network latency metric in deciding the video encoding bitrate.&lt;br /&gt;&lt;br /&gt;Problem 3:&lt;br /&gt;Sometimes, the conference server imposes bitrate control to limit the traffic towards a low bandwidth client. However, for efficiency reasons the server doesn't re-encode the video packets. Instead, it just drops non-Intra frames if there is not enough bandwidth. 
This gives marginal to no improvement, primarily because Intra frames are several times bigger than other frames. Secondly, it causes choppy video which further degrades the experience. Layered encoding in MPEG solves this problem.&lt;br /&gt;&lt;br /&gt;Problem 4:&lt;br /&gt;Larger video packets may not traverse end-to-end over UDP. An encoded audio packet is usually small, of the order of 10-80 bytes per 20 ms. On the other hand, an intra-frame video packet can be much larger, say 1000-10000 bytes. When media packets are sent over UDP, and the packet size is large, there is a high probability of the packet getting dropped. This is because of the MTU restriction and middle-boxes (NAT and firewall) in the media path. A UDP packet larger than the MTU (typically approx 1300-1400 bytes) gets fragmented at the IP layer such that the fragments after the first one do not carry the UDP header information (such as source and destination port numbers). A port inspecting NAT or firewall that doesn't handle fragmentation correctly may drop such subsequent fragments, causing loss of the whole UDP packet at the receiver end. Thus, video over UDP has to take care of additional fragmentation and reassembly, and/or discovery of path MTU, in the application layer.&lt;br /&gt;&lt;br /&gt;Problem 5:&lt;br /&gt;The server may allow video over UDP as well as TCP from the clients, typically to support NAT and firewall traversal. If some clients are over TCP and others over UDP, then the server also needs to proxy packets from one to the other. If the client over TCP assumes ordered packet delivery, then the server will also need to do buffering, packet re-ordering and delay adjustment, which further adds to the implementation complexity of the server. 
The problem is not that visible for audio beyond a glitch in sound, whereas for video the view may get completely corrupted until the next Intra frame.&lt;br /&gt;&lt;br /&gt;Problem 6:&lt;br /&gt;A slightly related problem is when the conference server does audio mixing but video forwarding. In this case, the server must perform delay adjustment, packet re-ordering, and buffering for the audio path. However, for efficiency reasons it may blindly forward the video packets among the participants. Thus the synchronization information between the audio and video gets lost, and performing lip synchronization at the receiving client becomes a challenge. A correct implementation of the server should act as an RTP mixer, i.e., include the contributing source information in the mixed audio stream, and distribute RTCP information to all the participants for synchronization. (How to do this if each audio call leg is a separate RTP session?)&lt;br /&gt;&lt;br /&gt;Some of these problems (2,3,5,6) can be solved to some extent by using peer-to-peer video conferencing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-7751283761643253156?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/7751283761643253156/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=7751283761643253156' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7751283761643253156'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/7751283761643253156'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/02/why-does-client-server-video-conference.html' title='Why does client-server video conference 
fail?'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-1631237194507407398</id><published>2009-01-16T01:46:00.003-05:00</published><updated>2009-11-24T03:39:24.769-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='API'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='event based'/><category scheme='http://www.blogger.com/atom/ns#' term='scalability'/><category scheme='http://www.blogger.com/atom/ns#' term='Internet'/><title type='text'>Asynchronous Internet Programming</title><content type='html'>Programming an Internet application requires handling several kinds of asynchronous events such as network events, file I/O, timers and user interactions. In this article I walk through some existing design patterns for asynchronous programming.&lt;br /&gt;&lt;br /&gt;Most programmers think synchronously, i.e., in a single control flow. For example, in a web server implementation, a socket is opened for listening, and when an incoming connection is received the server accepts the connection, receives the request, and responds to the client. There are several steps in this flow that can block, e.g., waiting for an incoming connection or an incoming request. If a resource such as a thread or process is blocked, it causes inefficiency in the overall system design. For example, if one thread is blocked serving a request from one client, another thread needs to be created to serve a request from another client. 
Creating a thread per request causes too many thread creations and deletions in the system when the request-per-minute load increases.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Event handlers&lt;/span&gt;:&lt;br /&gt;The first approach is to break your program into smaller chunks such that each chunk is non-blocking, and then use an event queue to schedule these chunks. Early Windows programming (WndProc) falls under this category. The main loop just listens for the events on an event queue. When a user input is received such as mouse movement, click or keyboard input, an event is dispatched to the event queue. The event handlers installed in the program handle the events, and may post additional events in the queue. External events such as socket input can also be linked to this event queue. One problem with this design is that several event handlers need to share state (hence global variables) which results in complex software.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt; while queue:             # queue is a list of pending events&lt;br /&gt;   msg = queue.pop(0)     # take the oldest event&lt;br /&gt;   dispatch(msg)          # calls the event handler for msg&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Some event-driven programming languages such as Flash ActionScript enforce this &lt;span style="font-style: italic;"&gt;event handler&lt;/span&gt; design practice coupled with object-oriented design. Typically, in ActionScript, each object has an event queue and can dispatch events, instead of exposing a global event queue to the programmer. An object can listen for events from another object, e.g., a controller program can listen for the click event on a button, and handle it appropriately. Thus, there can be some data hiding, information separation, and better software design using event handlers. In other languages such as C++, libraries such as &lt;a href="http://pocoproject.org/"&gt;Poco&lt;/a&gt; facilitate event-driven object-oriented programming. 
Nevertheless, one problem with this design is that logically similar source code needs to be split across different functions and methods, and sometimes different classes. For example, a timer event handler is defined as a separate function from the timer initialization code.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt; button.addEventListener('click', clickHandler);&lt;br /&gt; ...&lt;br /&gt; function clickHandler(event:Event):void {&lt;br /&gt;   ...&lt;br /&gt; }&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Inline Closures&lt;/span&gt;:&lt;br /&gt;Java (through anonymous inner classes) as well as ActionScript solve this problem using inline closures. In particular, you can define event handlers inline within a method definition, thus keeping the logically similar code together in your program file. However, this is not always possible, e.g., if the same timer needs to be started from several places in the code. In this case, it does make sense to keep the timer handler as a separate method.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt; timer.addEventListener('timer', function(event:TimerEvent):void {&lt;br /&gt;   ... // process timer event&lt;br /&gt; });&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Deferred Object&lt;/span&gt;:&lt;br /&gt;The &lt;a href="http://twistedmatrix.com/"&gt;Twisted Framework&lt;/a&gt; solves this problem using the Deferred Object pattern. The idea is as follows: instead of breaking the control flow of the source code every time a blocking operation occurs, the source code is broken only where the result of the blocking operation is needed. This makes a lot of difference in programming, and makes the software cleaner. For example, earlier when a socket connection was initiated we needed to install event handlers for the success and failure results and perform the rest of the processing in those event handlers. 
Now, using the Deferred Object pattern, the &lt;span style="font-family:courier new;"&gt;socket.connect&lt;/span&gt; returns a deferred object, which is used in the same control flow as if it were a result of the &lt;span style="font-family:courier new;"&gt;connect&lt;/span&gt; method. The result object can be passed around elsewhere like a regular object. Only when we need to do something on completion of the connection do we wait on the &lt;span style="font-family:courier new;"&gt;result&lt;/span&gt; object. This can be done either synchronously using &lt;span style="font-family:courier new;"&gt;result.wait()&lt;/span&gt; or asynchronously by installing success and error callbacks. Provisions are there to allow one deferred object to wait on the result of another deferred object before proceeding. An example follows:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt; result = socket.connect(...)&lt;br /&gt; result.wait()&lt;br /&gt; if result:&lt;br /&gt;   ... # connection successful&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Co-operative multitasking&lt;/span&gt;:&lt;br /&gt;Python's &lt;a href="http://o2s.csail.mit.edu/o2s-wiki/multitask"&gt;multitask&lt;/a&gt; module allows cleaner source code by taking advantage of co-operative multitasking. The idea is to build application-level threads of control which co-operate during blocking using the built-in &lt;span style="font-family:courier new;"&gt;yield&lt;/span&gt; keyword. A global scheduler runs in the main application. I find this to be the cleanest design in my programming. An example follows.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt; data, addr = yield multitask.recvfrom(socket, ...)&lt;br /&gt; yield multitask.sendto(socket, 'response', addr)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;One major problem with these approaches is that because of the inherent underlying single event queue software architecture, we cannot easily take advantage of a multi-processor CPU architecture. 
Thus, we must use a multi-threaded design in our software. Most of the earlier designs can be extended to multi-threaded applications with some care. For example, one can run multiple global event loops listening on a single shared and locked event queue, or one event queue per thread where the dispatcher schedules each event to the appropriate thread's queue. Clearly, the first one is more efficient from a queuing theory perspective. Care must be taken to lock the critical sections in event handlers, or any control flow that may get accessed from different threads at the same time.&lt;br /&gt;&lt;br /&gt;Consequently, we get several multi-threaded designs: thread-per-request, thread-pool, two-stage thread-pool, etc. The performance comparison of these designs in the context of a SIP server is shown in my paper titled &lt;a href="http://www.cs.columbia.edu/%7Ekns10/publication/sipload-extended.pdf"&gt;Failover, Load Sharing and Server Architecture in SIP Telephony&lt;/a&gt;, Sec 6.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-1631237194507407398?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/1631237194507407398/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=1631237194507407398' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/1631237194507407398'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/1631237194507407398'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/01/asynchronous-internet-programming.html' title='Asynchronous Internet Programming'/><author><name>Kundan 
Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-2798402305208893755</id><published>2009-01-15T13:23:00.004-05:00</published><updated>2009-11-24T03:40:02.940-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='ActionScript'/><category scheme='http://www.blogger.com/atom/ns#' term='API'/><category scheme='http://www.blogger.com/atom/ns#' term='Tcl'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='C++'/><category scheme='http://www.blogger.com/atom/ns#' term='Python'/><title type='text'>Programming languages for implementing SIP</title><content type='html'>The programming language used for the implementation can affect the software architecture. For example, Flash ActionScript is a pure event-based language with no way of implementing a blocking operation. Hence, when a connection is made on a socket object, the socket object will dispatch the success or failure event. The caller installs the appropriate event handlers to continue the processing after the connection is completed.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;C&lt;/span&gt;&lt;br /&gt;There are two main reasons for implementing SIP in C: the ability to compile on several platforms and very high performance. The primary advantage of implementing a SIP stack in C is that it can be easily ported and compiled on a variety of platforms, especially embedded platforms. Usually a C compiler is available for a platform, whereas others such as a Java interpreter or a C++ compiler may not be. 
Secondly, because there is no overhead (e.g., in terms of run-time environment and code size), the performance is usually the best. The main problem with implementing a SIP stack in C is the development time and the cost of maintaining the software. Finding bugs and adding a feature in a C program is usually more challenging than in other languages. However, if the software is well designed then the problem can be alleviated to a large extent. In any case, the number of lines of code that need to be written in C is usually much larger than in high-level languages such as Java or Python.&lt;br /&gt;&lt;br /&gt;The pjsip project provides an implementation of SIP and other related protocols. It has been used in a variety of real-world projects and has proven itself to be a good SIP implementation, especially where performance matters or the resources are limited.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;C++&lt;/span&gt;&lt;br /&gt;The object-oriented design allows better reusability and maintainability compared to programming in C. However, the number of lines of code is still large. If advanced C++ features such as the standard template library (STL) are used then portability may become a concern for certain embedded platforms.&lt;br /&gt;&lt;br /&gt;One of my earlier SIP implementations was in C++ (and C) at Columbia University. Another example implementation is reSIProcate, an open source SIP stack.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Java&lt;/span&gt;&lt;br /&gt;The Java programming language is very popular in the corporate world and among enterprise application developers. The standards community has developed APIs that cover several aspects of SIP implementation and some of its extensions. This allows the application to be built against those APIs whereas the actual implementation of the SIP stack can be provided by several vendors. The main problem with an implementation written in Java is that it tends to be too verbose. 
Java on one hand gives the illusion of a very high level language, but on the other hand requires the programmer to write a lot of code even to do a small thing. Part of this is because of the way the language is defined -- in particular the exception handling and strict compiler-enforced type checking. This requires the programmer to do a lot of typecasting and hence there is potential for run-time errors. Another problem with Java is that the run-time could become a memory hog.&lt;br /&gt;&lt;br /&gt;NIST SIP Stack provides the reference implementation of the JAIN SIP API.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Tcl&lt;/span&gt;&lt;br /&gt;The Tcl programming language is not as popular as other high-level languages such as Perl, PHP or Java. However, because of the simplicity of the language constructs it is very easy to learn. On the other hand, unless the software is designed right, it becomes very difficult to maintain a large piece of software.&lt;br /&gt;&lt;br /&gt;Columbia’s SIP user agent, &lt;a href="http://www1.cs.columbia.edu/%7Exiaotaow/sipc/"&gt;sipc&lt;/a&gt;, is written in Tcl.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;ActionScript&lt;/span&gt;&lt;br /&gt;ActionScript (or ECMAScript 4) improves on the Java programming language by allowing much smaller source code size of the implementation and much shorter syntax for common operations. However, there are two major limitations: the platform is limited to Flash Player or AIR (Adobe Integrated Runtime) and the language is purely event-based. The limitation of Flash Player may not seem important at the beginning, but prevents certain features. For example, the current version of Flash Player doesn’t have UDP or TLS sockets that could be used for SIP. Even the functions of a TCP socket are limited in that you can only initiate connections but cannot receive incoming connections. 
This prevents us from implementing a complete SIP stack in ActionScript without support from Flash Player or an additional plugin. The Flash Player version for embedded devices is usually older than the current version, which makes portability an issue. Because of the run-time overhead the performance is limited. The media codecs supplied by Flash Player 9 or earlier are proprietary codecs that are usually not supported by any other implementation. This causes interoperability issues as well.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Python&lt;/span&gt;&lt;br /&gt;The object-oriented nature, the compact coding style and the very small source code size of the implementation make Python a very good choice for implementing application protocols such as SIP. There is some overhead because it is an interpreted language; however, the overhead is comparable to that of the Java run-time. The interpreter is now usually available for embedded platforms as well, making it more portable than other languages such as ActionScript or C++.&lt;br /&gt;The biggest advantage of an implementation in Python is that the code size is drastically smaller than in other languages. I have implemented the basic SIP stack in less than 2000 lines of Python source code. Compare this with the Java implementation of SIP which has more than 1000 files. 
The smaller size means that not only is the development time shorter, but the testing, code review and maintenance costs are also much lower.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-2798402305208893755?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/2798402305208893755/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=2798402305208893755' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2798402305208893755'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2798402305208893755'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/01/programming-languages-for-implementing.html' title='Programming languages for implementing SIP'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-3129194377813111149</id><published>2009-01-15T07:44:00.012-05:00</published><updated>2009-11-24T03:40:42.354-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='P2P'/><category scheme='http://www.blogger.com/atom/ns#' term='API'/><category scheme='http://www.blogger.com/atom/ns#' term='DHT'/><category scheme='http://www.blogger.com/atom/ns#' term='Protocols'/><title type='text'>P2P API</title><content type='html'>In this article I describe several sets of APIs for P2P. 
For a software engineer, designing a good API is very important. A good abstraction of the underlying concept leads to a good API. For example, the data model of storing key-value pairs in a P2P network is usually abstracted as a hash table. Programmers like seeing existing semantics in an API because they are already familiar with those semantics. This article uses API ideas from OpenDHT, JXTA, Adobe Flash Player, and the IETF P2P-SIP draft.&lt;br /&gt;&lt;br /&gt;A good API&lt;br /&gt;&lt;ol&gt;&lt;li&gt;takes the form of use-cases&lt;/li&gt;&lt;li&gt;is very general and concise: does one thing and does it well&lt;/li&gt;&lt;li&gt;is self-explanatory and is similar to existing concepts, models, practices&lt;/li&gt;&lt;li&gt;is independent of implementation details&lt;/li&gt;&lt;/ol&gt;At a high level, there are three abstractions for P2P APIs: data storage, peer connectivity and group membership. There are several special cases among these abstractions.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;&lt;span style="font-size:100%;"&gt;Data Storage&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;A distributed hash table (DHT) is a form of structured P2P network with data stored using a hash table abstraction. In particular, it provides put, get and remove API methods. Programmers are familiar with the container semantics of hash tables. In Python, this looks like:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;a = DHT()&lt;br /&gt;a['key1'] = 'value1'&lt;br /&gt;print a['key1']&lt;br /&gt;del a['key1']&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;/span&gt;Let us apply this to a use-case of P2P-SIP user location storage. In particular, the key is the user identifier of the form 'kundan@example.net' and the value is the user location of the form 'kns10@192.1.2.3:5062'. With this we see several problems in the existing container semantics of the API listed above. 
Firstly, a user can have several locations, in which case a call request is sent to all those locations as in SIP forking proxy behavior. The following modification to the API causes more confusion because &lt;span style="font-family:courier new;"&gt;set&lt;/span&gt; takes a single value whereas &lt;span style="font-family:courier new;"&gt;get&lt;/span&gt; returns multiple values.&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;a['kundan@example.net'] = 'kns10@192.1.2.3:5062'&lt;br /&gt;print a['henning@example.net']   # prints all contacts of henning&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;/span&gt;To solve this we can assume 'set' semantics for &lt;span style="font-family:courier new;"&gt;a[k]&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;a['key1'] += 'value1'&lt;br /&gt;a['key2'] += 'value2'&lt;br /&gt;print a['key1']  # print a list of values&lt;br /&gt;a['key1'] -= 'value1' # remove specific key1-value1&lt;br /&gt;del a['key1']    # remove all values for key1&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;Another problem is that this API is not secure or authenticated. A secure DHT API based on public-key infrastructure can sign the stored value on &lt;span style="font-family:courier new;"&gt;set&lt;/span&gt; and &lt;span style="font-family:courier new;"&gt;del&lt;/span&gt;.&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;a = DHT(privatekey=...)   # supply the private key of the owner&lt;br /&gt;print a['key1']           # print all values for given key&lt;br /&gt;print a(owner=...)['key1']    # print values only by given owner&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;A third problem is that the API is not extensible to pure event-based languages such as Flash ActionScript. In ActionScript, there are no blocking operations, hence &lt;span style="font-family:courier new;"&gt;set&lt;/span&gt; and &lt;span style="font-family:courier new;"&gt;get&lt;/span&gt; must be defined asynchronously. 
ActionScript already has the SharedObject abstraction to deal with remote shared storage of objects. This can be reused for the API as follows:&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;so = DistributedSharedObject.getRemote(privatekey=...)&lt;br /&gt;so.setProperty('key1', 'value1') // sets the key1-value1 pair&lt;br /&gt;so.retrieve('key2')        // initiates get&lt;br /&gt;so.addEventListener('sync', syncHandler)&lt;br /&gt;so.addEventListener('propertyChange', changeHandler)&lt;br /&gt;function syncHandler(event:SyncEvent):void {&lt;br /&gt;... // put is completed&lt;br /&gt;}&lt;br /&gt;function changeHandler(event:PropertyChangeEvent):void {&lt;br /&gt;if (event.property == 'key2') // get is completed&lt;br /&gt;  trace(so.data['key2'])&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;&lt;br /&gt;Another problem is that the API doesn't take into account the time-to-live of a key-value pair. This can be solved by supplying a default timeout in the constructor.&lt;br /&gt;&lt;span&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;d = DHT(privatekey=..., timeout=3600)   # default TTL of one hour&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;Note that a data storage API can be built on top of a routing and connectivity API. Since we would like to separate the abstractions (data storage and peer connectivity/routing), we will define separate sets of APIs for these.&lt;br /&gt;&lt;span&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;d = DHT(net=..., privatekey=..., timeout=...) # use the given connectivity (net) object&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;&lt;span style="font-size:100%;"&gt;Peer connectivity and routing&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;The connectivity and routing layer deals with maintenance of the P2P network. 
Programmers are familiar with the &lt;span style="font-family:courier new;"&gt;socket&lt;/span&gt; abstraction and there have been efforts to map P2P to the socket abstraction [2].&lt;br /&gt;&lt;span&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;s0 = ServerSocket(...)  # a P2P node is created&lt;br /&gt;s0.bind(identity=..., credentials=...) # node joins the P2P network&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;&lt;/span&gt;The &lt;span style="font-family:courier new;"&gt;bind&lt;/span&gt; method is similar to the JXTA semantics, and similar to the &lt;span style="font-family:courier new;"&gt;attach&lt;/span&gt; method as proposed in the IETF P2P-SIP work. Actual communication with a specific peer can happen over a connected socket. A connected socket is returned by the &lt;span style="font-family:courier new;"&gt;connect&lt;/span&gt; or &lt;span style="font-family:courier new;"&gt;accept&lt;/span&gt; API methods on the server-socket.&lt;br /&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;s1 = s0.connect(remote=...) # connect to the given remote identity&lt;br /&gt;s2, remote = s0.accept()    # receive an incoming connection from remote&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;The connection procedure takes care of negotiating connectivity checks using ICE (or a similar algorithm) to allow NAT and firewall traversal.&lt;br /&gt;&lt;br /&gt;Once connected, the socket can be used to send any message to, or receive any message from, the peer.&lt;br /&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;s1.send(data=..., timeout=...)&lt;br /&gt;data = s1.recv()&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;The original server-socket can be used to send messages to, or receive messages from, specific peers without an explicit connection.&lt;br /&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;s0.sendto(data=..., remote=...)  
# send to specific peer&lt;br /&gt;data, remote = s0.recvfrom(timeout=...)&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;Sometimes, a node needs to send a message to any available node close to a given identifier. This can be implemented using an overloaded sendto method that takes either a remote address or a key identifier. The latter performs P2P routing to the given destination key identifier.&lt;br /&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;s0.sendto(data=..., key=...)&lt;br /&gt;data, remote, local = s0.recvfrom(...)&lt;br /&gt;if local == s0.sockname:  # received for this node identifier&lt;br /&gt;  ...&lt;br /&gt;else:  # received for the given key, presumably close to this node&lt;br /&gt;  key = local&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;A connected socket can be disconnected, or a node removed from the P2P network, using the close method.&lt;br /&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;s1.close()  # close the connection&lt;br /&gt;s0.close()  # remove the node&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;These abstractions allow building distributed object location and routing APIs [1] using a locality-aware decentralized directory service.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;&lt;span style="font-size:100%;"&gt;Group membership&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;Group membership APIs are similar to the multicast and anycast socket APIs. A node can join or leave a group, and a message can be sent via multicast or anycast to one or more nodes in a group. Let us define a group identifier similar to that in JXTA. We extend the previous socket abstraction to create a new group.&lt;br /&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;s3 = s0.join(group=...)  
# join a given group&lt;/pre&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;The join method returns a semi-connected socket object which can be used to send or receive packets on the given group.&lt;br /&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span style="font-size:85%;"&gt;&lt;pre&gt;s3.send(data=...)        # send to every node in the group&lt;br /&gt;s3.sendany(data=...)     # send to at most one node in the group&lt;br /&gt;data, remote, key = s3.recv()  # receive multicast or anycast data&lt;br /&gt;&lt;/pre&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;The socket API gives an intuitive abstraction for understanding the P2P concepts. The actual implementation of the group membership may be more complex than simple routing and connectivity. Closing a group socket removes the node from the group.&lt;br /&gt;&lt;br /&gt;We have seen how a P2P API can evolve to accommodate various concepts into existing well-known abstractions such as hash table and socket.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;&lt;span style="font-size:100%;"&gt;References:&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-size:78%;"&gt;&lt;span style="font-size:85%;"&gt; 1. Towards a common API for structured peer-to-peer overlays (2003)&lt;a href="http://oceanstore.cs.berkeley.edu/publications/papers/pdf/iptps03-api.pdf"&gt; http://oceanstore.cs.berkeley.edu/publications/papers/pdf/iptps03-api.pdf&lt;/a&gt;&lt;br /&gt;2. 
The Socket API in JXTA 2.0&lt;a href="http://java.sun.com/developer/technicalArticles/Networking/jxta2.0/"&gt; http://java.sun.com/developer/technicalArticles/Networking/jxta2.0/&lt;/a&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-3129194377813111149?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/3129194377813111149/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=3129194377813111149' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3129194377813111149'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/3129194377813111149'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2009/01/p2p-api.html' title='P2P API'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-2921480865267130352</id><published>2008-09-29T04:07:00.005-04:00</published><updated>2009-11-24T03:41:23.091-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P'/><category scheme='http://www.blogger.com/atom/ns#' term='39 Peers'/><category scheme='http://www.blogger.com/atom/ns#' term='Open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category 
scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='Python'/><title type='text'>Welcome to 39 peers!</title><content type='html'>I have launched an open-source project named "&lt;a href="http://www.39peers.net/"&gt;39 peers&lt;/a&gt;". From the web site:&lt;br /&gt;&lt;br /&gt;"The &lt;em&gt;39 Peers&lt;/em&gt; project aims at implementing an open-source peer-to-peer Internet telephony software using the Session Initiation Protocol (P2P-SIP) in the Python programming language. The software is still incomplete -- especially the P2P part. &lt;div class="contentpaneopen"&gt;&lt;div class="article-content"&gt;&lt;p&gt;Peer-to-peer systems inherently have high scalability, fault tolerance and robustness against catastrophic failures because there is no central server and the network self-organizes itself. Internet telephony can be an application of peer-to-peer architecture where the participants locate and communicate with each other without relying on expensive or managed service providers. &lt;em&gt;39 peers&lt;/em&gt; project is an attempt to provide a open source and free-for-all peer-to-peer network targeted towards open standards based real-time communication.&lt;/p&gt;&lt;p&gt;The &lt;em&gt;39 peers&lt;/em&gt; project is developed for student developers and researchers to experiment with new ideas. It is written in Python scripting language. It supports open protocols such as IETF SIP and RTP. 
It is licensed under GNU/GPL license."&lt;/p&gt;&lt;/div&gt;    &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-2921480865267130352?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/2921480865267130352/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=2921480865267130352' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2921480865267130352'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/2921480865267130352'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2008/09/welcome-to-39-peers.html' title='Welcome to 39 peers!'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-5688654934313207366</id><published>2008-06-05T14:41:00.003-04:00</published><updated>2009-11-24T03:42:01.664-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='P2P'/><category scheme='http://www.blogger.com/atom/ns#' term='memcached'/><category scheme='http://www.blogger.com/atom/ns#' term='DHT'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='scalability'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>memcached</title><content type='html'>Recently I got a chance to look at &lt;a 
href="http://www.danga.com/memcached/"&gt;memcached&lt;/a&gt;, a distributed in-memory cache.&lt;br /&gt;&lt;br /&gt;There are two ways of scaling databases: (1) master-slave replication, in which reads can be served by any slave database whereas writes must go to the master, and (2) distributing the data among segments, so that accessing a data item touches only one segment instead of the whole database. The first technique doesn't work well for scaling SIP-related contact data, because the number of writes (login/logout, presence updates) is significant compared to the number of reads (call routing, IM delivery). Distributing the data among segments has been done before, and my thesis also describes it in the context of SIP telephony.&lt;br /&gt;&lt;br /&gt;The memcached system uses this second technique, and hence performs much better in certain cases. It is targeted towards web applications that generate dynamic web content by accessing a database at the back-end. One could run memcached on several servers, potentially co-located with the web servers in the server farm. The dynamic web service (e.g., written in PHP) is then modified to first look in the cache and, if the data is not found, query the actual database (and update the cache as well). The cache provides a distributed hash table (DHT) interface.&lt;br /&gt;&lt;br /&gt;When data is updated in the cache, it gets stored in the appropriate memcached instance based on some hash of the data key. When a query is done, that instance is queried for the data. Each cache instance can be configured with a memory limit, typically 1-3 GB depending on the available memory in the system. The distributed nature of the cache allows you to scale the cache linearly by just adding more instances on different machines.&lt;br /&gt;&lt;br /&gt;At first glance this looks very promising for use in a P2P-SIP application. However, there are some serious limitations. 
Although memcached has a built-in hashing mechanism for data storage, there is no automatic replication of data (hence no fail-over), and the hashing that selects an instance needs to be implemented by the client itself (hence no recursive queries). In particular, memcached implements only one part of a full P2P data storage application such as OpenDHT. Because of these limitations, it cannot be used effectively for P2P-SIP client implementations without adding a lot of additional software. Nevertheless, memcached can potentially be used for a P2P-SIP proxy server farm.&lt;br /&gt;&lt;br /&gt;As a new project idea, it would be interesting to try to use memcached in the openser SIP proxy server to readily build a P2P SIP server farm without much overhead! If you are a project student willing to work on this, I will be happy to mentor you.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-5688654934313207366?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/5688654934313207366/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=5688654934313207366' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/5688654934313207366'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/5688654934313207366'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2008/06/memcached.html' title='memcached'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' 
src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-9089080996152947175</id><published>2008-01-02T02:42:00.002-05:00</published><updated>2009-11-24T03:42:22.288-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Business'/><category scheme='http://www.blogger.com/atom/ns#' term='Open source'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><title type='text'>Business out of P2P-SIP</title><content type='html'>Here are some of my thoughts on how to make (or save) money with P2P-SIP.&lt;br /&gt;&lt;br /&gt;There are two points to note: (1) An end-user doesn't know (or care) whether he communicates using client-server or peer-to-peer as long as he gets all the features (audio, video, text, conferencing, offline messages, etc.), security and speed. (2) Unlike a service infrastructure where the provider can make money out of its user base (either charging per call or per minute as in Skype-out, or via advertisements as in YouTube?), there is no solid business model for P2P-SIP. Since it is an open protocol, anyone can develop a client and break free from the provider. Think of it as an email client technology -- how many vendors make money selling email clients?&lt;br /&gt;&lt;br /&gt;There are some alternatives to this ideal view of P2P-SIP though. For example, one could control the user base -- keep the enrollment of the user controlled by the provider and allow communication only among the clients of the same provider. This requires that the implementation be proprietary (at least the security and validation part); otherwise anyone may write an alternate implementation to bypass the validation. 
Thus the protocol doesn't remain open and is not P2P-SIP.&lt;br /&gt;&lt;br /&gt;Alternatively, one could build a server farm of SIP proxies using P2P technology so that the maintenance cost of the servers is reduced. However, this still requires a dedicated set of servers (electricity, bandwidth, call centers, etc.), hence the cost saving is very low. Even though you may be using P2P-SIP, you won't save much money.&lt;br /&gt;&lt;br /&gt;Thirdly, one could build and sell P2P-SIP software, e.g., for enterprises, that doesn't require expensive call managers or central servers. This works well, but has one problem. People have come to expect that communication, at least PC to PC, is free. Unless the client software provides tons of new features that cannot be easily replicated by another vendor, selling such client software is also difficult.&lt;br /&gt;&lt;br /&gt;So, how do you make money? You don't. Instead, you can save money.&lt;br /&gt;&lt;br /&gt;You can save money that you would otherwise spend on maintaining the server infrastructure, or you can enable new businesses that were earlier not possible because of closed systems. Most of the current communication mechanisms have evolved in a tightly controlled environment, e.g., mobile phones, Skype, Yahoo, etc. Instead, what we need is an open mechanism similar to email and the web. An email client is not tied to a single server. A web client (browser) is not tied to a single server or web site. A communication client should not be tied to a single provider.&lt;br /&gt;&lt;br /&gt;Consider an open system where anyone can call anyone else without paying a provider anything beyond the Internet connection fees. This opens the door for a whole new set of businesses. For example, a small-time vendor, consultant or topic expert can lend his service in real time for a small fee without going through the hassle of setting up an agreement with a provider. 
Existing systems such as social network sites or content distribution networks can have an additional feature that allows a user to interact with his friends, or other users on the same site, or other users watching the same content, irrespective of the attachment of the user to the provider. There is a huge potential for innovations in the communication client space, similar to how web browsers have evolved.&lt;br /&gt;&lt;br /&gt;So, who can do this? I don't think it will come from existing businesses that have a large user base -- this includes the likes of Yahoo, MSN, AIM, or Skype. The reason is that they wouldn't want to open up their users to others. I also don't think it will come from existing businesses that are built around selling advertisements -- the likes of YouTube, Google, etc.&lt;br /&gt;&lt;br /&gt;The question to ask is: who benefits from ubiquitous P2P-SIP? Media companies and content owners should invest in making P2P-SIP ubiquitous so that their existing system is improved or becomes more appealing to end users. Such companies can probably sponsor open source projects for P2P-SIP. If there is a better and free communication infrastructure, all sorts of new innovations are possible. For example, people would like to talk with their friends while watching a TV series, a cricket match or a movie online.&lt;br /&gt;&lt;br /&gt;In conclusion, I feel that big companies that can benefit from real-time communication, but don't have an existing solution to work with, should invest in open-source P2P-SIP technologies. 
If you are interested in talking about it, feel free to drop me an email.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-9089080996152947175?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/9089080996152947175/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=9089080996152947175' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/9089080996152947175'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/9089080996152947175'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2008/01/business-out-of-p2p-sip.html' title='Business out of P2P-SIP'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-115107807539281499</id><published>2006-06-23T11:51:00.001-04:00</published><updated>2009-11-24T03:42:46.918-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Specification'/><category scheme='http://www.blogger.com/atom/ns#' term='API'/><category scheme='http://www.blogger.com/atom/ns#' term='DHT'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><title type='text'>Data format and interface for external P2P mode</title><content type='html'>Last month, Henning and I submitted an internet-draft on "Data format and interface to an external peer-to-peer network for SIP location 
service". The &lt;a href="http://www1.cs.columbia.edu/~kns10/publication/draft-singh-p2p-sip-00.txt"&gt;text&lt;/a&gt; and &lt;a href="http://www1.cs.columbia.edu/~kns10/publication/draft-singh-p2p-sip-00.html"&gt;HTML&lt;/a&gt; version are available. The idea is to use an interoperable XML format for accessing the DHT (which is based on OpenDHT's interface) and an XML format for storing user location data such as contact address, cryptographic keys and certificates, offline messages, presence watcher information, etc.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-115107807539281499?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/115107807539281499/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=115107807539281499' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/115107807539281499'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/115107807539281499'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2006/06/data-format-and-interface-for-external.html' title='Data format and interface for external P2P mode'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-115107048040485886</id><published>2006-06-23T09:26:00.001-04:00</published><updated>2009-11-24T03:43:29.680-05:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='Unstructured'/><category scheme='http://www.blogger.com/atom/ns#' term='Kazaa'/><category scheme='http://www.blogger.com/atom/ns#' term='Bamboo'/><category scheme='http://www.blogger.com/atom/ns#' term='Chord'/><category scheme='http://www.blogger.com/atom/ns#' term='Structured'/><category scheme='http://www.blogger.com/atom/ns#' term='DHT'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='Skype'/><title type='text'>Structured vs unstructured P2P or Why we chose DHT for P2P-SIP?</title><content type='html'>One of the questions that people ask me is 'whether structured or unstructured P2P works better for SIP-based Internet telephony?'. My answer is that structured P2P such as a distributed hash table (DHT) works better, for the following reasons.&lt;br /&gt;&lt;br /&gt;Unstructured P2P networks usually rely on caching and replication to improve the lookup performance and reliability. In the case of a file-sharing application, once the file named "matrix-II.mpg" is uploaded in a P2P network such as Kazaa, the content of the file remains the same (immutable). Thus, for reliability the file may get replicated at multiple places. Intermediate nodes in the download path may cache the file locally to improve download performance for other downstream nodes. On the other hand, a phone's contact location or IP address uploaded under the resource named "bob@home.com" may change often (i.e., it is mutable), e.g., if Bob changes his device or the device's IP address is changed. This makes all the randomly replicated and cached copies of the resource content useless. Thus, random caching and replication are not useful.&lt;br /&gt;&lt;br /&gt;In an unstructured P2P network, any replica of the file "matrix-II.mpg" is good enough, whereas in Internet telephony you want to reach only "bob@home.com" but not some replica of Bob. You cannot replicate Bob's phone. 
The only thing that can be replicated is the mapping between Bob's identifier and his phone's IP address. However, to allow frequent updates and a consistent view of the mapping, it is desirable that the mapping be replicated in a structured manner, so that Bob can easily update the mapping and any prospective caller can easily query the up-to-date copy of this mapping. This means that, given the resource name, we know where the resource should be present in the P2P network. Thus, it is structured.&lt;br /&gt;&lt;br /&gt;Unstructured P2P is suitable for random or blind search and allows wild-card search. For example, you can search for a file name containing the "matrix" keyword. On the other hand, in Internet telephony most calls are made to a known destination (phone number or destination user identifier). Hence, a hash table interface that allows lookup based on a key is good enough. Distributed hash table (DHT)-style P2P algorithms such as Pastry, Chord and Bamboo provide an ideal platform for such lookups.&lt;br /&gt;&lt;br /&gt;Skype is believed to be unstructured among the super-nodes, but I think there may be some form of structure in how the data is stored among the super-nodes. Plus, an unstructured search has no guaranteed upper bound on lookup cost, which means Skype might be falling back to some centralized lookup if the number of hops in a search exceeds some limit. 
Since Skype is proprietary, we do not know how exactly it does lookups.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-115107048040485886?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/115107048040485886/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=115107048040485886' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/115107048040485886'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/115107048040485886'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2006/06/structured-vs-unstructured-p2p-or-why.html' title='Structured vs unstructured P2P or Why we chose DHT for P2P-SIP?'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-114142052829454928</id><published>2006-03-03T15:44:00.001-05:00</published><updated>2009-11-24T03:43:48.017-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Specification'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><title type='text'>New internet-drafts</title><content type='html'>Last week, there were some new I-Ds on p2p-sip.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.ietf.org/internet-drafts/draft-matthews-p2psip-nats-and-overlays-00.txt"&gt;draft-matthews-p2psip-nats-and-overlays-00.txt&lt;/a&gt;&lt;br /&gt;The draft 
analyzes various issues with NAT in the context of P2P-SIP and presents some important conclusions: (1) a partial mesh of peers is better suited than, for example, ring or full-mesh connections; (2) connections can remain mostly static for maintenance of p2p; (3) structured p2p is better suited because of its bounded connection count; (4) symmetric P2P (such as Kademlia?) is better suited because the connections across NAT are bidirectional; and (5) superpeers can be used, where all peers behind a common NAT elect a small number of superpeers which handle connections across the NAT. The basic idea is to understand the NAT limitations, such as the number of pinholes allowed and the timeout, and apply these to a deployable architecture.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.comet.columbia.edu/~eunsoo/ietf/draft-shim-sipping-p2p-arch-00.txt"&gt;draft-shim-sipping-p2p-arch-00.txt&lt;br /&gt;&lt;/a&gt;&lt;br /&gt;The basic idea is to use the SIP-using-P2P architecture. There is some overlap with my 'external DHT' technical report. Some points of contention are: the draft proposes a P2P lookup first, and then falls back to DNS (RFC 3263), which has a negative effect on performance since P2P lookup is slower. The format of the record can reuse the SIP Contact header instead of specifying individual terms such as TCP80, UDP5060, etc. The login server is clearly undesirable. But the draft is a good starting point. We will need more maturity in the areas of security, authentication, and data format.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.ietf.org/internet-drafts/draft-mayrhofer-enum-domainkeys-00.txt"&gt;draft-mayrhofer-enum-domainkeys-00.txt&lt;/a&gt;&lt;br /&gt;It proposes to use telephone numbers instead of the email-like user@domain format because one can validate that the number owner actually owns the telephone. 
The main problem is that phone numbers are too rigid and tied to a physical port or line, thus we cannot have identifiers for virtual users or devices, e.g., sip:lamp@cs.columbia.edu belongs to an electric X10-controlled lamp in our lab. Secondly, phone numbers are hard to remember, and thus need an address book. Thus it doesn't make much difference whether we use a phone number or an email address as long as we know the remote party's public key. The main problem to be solved is the distribution of public keys in p2p. A certificate-based approach can be used instead.  &lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.ietf.org/internet-drafts/draft-quittek-p2p-sip-middlebox-00.txt"&gt;draft-quittek-p2p-sip-middlebox-00.txt&lt;/a&gt;&lt;br /&gt;It tries to list various requirements for NAT traversal in P2P-SIP. It adds little new value and ignores some of the existing NAT traversal schemes such as ICE.&lt;br /&gt;&lt;br /&gt;In conclusion, there are still a lot of open issues in P2P-SIP, most notably in security, authentication and NAT/firewall traversal. Any solution for P2P-SIP must reuse SIP's NAT traversal schemes such as ICE. 
There is a new version of &lt;a href="http://www.ubiquitysoftware.com/ietf/draft-ietf-sipping-nat-scenarios-04.txt"&gt;best practices draft&lt;/a&gt; for NAT traversal in SIP.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-114142052829454928?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/114142052829454928/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=114142052829454928' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/114142052829454928'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/114142052829454928'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2006/03/new-internet-drafts.html' title='New internet-drafts'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-114078565457493732</id><published>2006-02-24T07:34:00.001-05:00</published><updated>2009-11-24T03:44:14.009-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Kazaa'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='Skype'/><title type='text'>Thunderbird as a deployment vehicle</title><content type='html'>One of the reasons for success of Skype is that they already had a deployment vehicle in the form of KaZaA file sharing network. 
I believe that for mass deployment of any P2P VoIP tool, including open-standard-based P2P-SIP, we need free software availability and a deployment vehicle.&lt;br /&gt;&lt;br /&gt;A deployment vehicle is just a mechanism or tool that already exists and is widely used, e.g., tools for email, web, file sharing, or something as basic as the Windows OS. For example, someone can add P2P-SIP support in popular email clients such as Microsoft Outlook Express or Mozilla Thunderbird. The latter is easier since it is open source. &lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.cs.columbia.edu/IRT/students/spring2006/"&gt;Project description&lt;/a&gt;: the Thunderbird client can be extended to also serve as a (peer-to-peer) voice and IM client. IM buddies would appear in the Thunderbird address book, allowing it to display buddies and their online status. IM transcripts might be saved and searched in the same way that messages are saved. This work can leverage a SIP project, http://www.croczilla.com/zap/, and might build on the Cockatoo project at http://cockatoo.mozdev.org.&lt;br /&gt;&lt;br /&gt;One issue is that not many people use Thunderbird compared to Microsoft products (although I believe that will change in the future). Alternatively, a web interface similar to Gmail and google-talk can be provided that can be easily accessed from any web browser. 
The challenge is to use P2P instead of central web server.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-114078565457493732?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/114078565457493732/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=114078565457493732' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/114078565457493732'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/114078565457493732'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2006/02/thunderbird-as-deployment-vehicle.html' title='Thunderbird as a deployment vehicle'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-114078416178466361</id><published>2006-02-24T07:03:00.001-05:00</published><updated>2009-11-24T03:44:36.321-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='security'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='authentication'/><title type='text'>Email based identity in P2P-SIP</title><content type='html'>Our p2p-sip architecture (see the previous 'external DHT' post) suggests using email address as user identity for SIP. 
In particular, our web CGI script (as a certifying authority, CA) generates the user certificate verifying that the user owns the identifier, say bob@example.net, and sends this to the email address bob@example.net. &lt;br /&gt;&lt;br /&gt;This serves two purposes: (1) &lt;span style="font-weight:bold;"&gt;no login server&lt;/span&gt;: unlike Yahoo/MSN/Skype, which need to connect to a 'login server' on every login attempt, our p2p-sip uses the (central) CA only for first-time use. The user certificate also gets stored in the P2P network, thus there is no need to connect to the 'login server' again. (2) &lt;span style="font-weight:bold;"&gt;any identity provider&lt;/span&gt;: unlike Yahoo/MSN/Skype/G-Talk, which tie the identifier to a particular provider (e.g., @yahoo.com for Yahoo users), we can allow any user identifier as long as the identifier belongs to the user.&lt;br /&gt;&lt;br /&gt;To avoid use of a single identity (service) provider, one option is to use the user's email address as his SIP user id. Thus, identity verification just involves making sure that his user id is a valid email address that belongs to him, and that the user has a certificate that proves this.&lt;br /&gt;&lt;br /&gt;Our web CGI script generates the user certificate, where the user's public key and certificate request are supplied by the user (automatically by sipc, on first-time use). But to make sure that the user id is his email address, the certificate is sent to that address by email. This prevents you from using the user id "bill@microsoft.com" because you can't get a certificate from our CA unless you own this email address. But if you have an email address, say "bob@yahoo.com" or "Robert@msn.net", you can use this as your SIP identifier in p2p-sip, and thus are not tied to a single provider.&lt;br /&gt;&lt;br /&gt;Sending by email is just one of the ways. Alternatively, if a group of users already have user certificates from another trusted entity such as VeriSign, they don't need email-based certificates. 
Another possibility for future work is to also allow 'tel:' identity if the user can call from that telephone number (with caller id) to our VoiceXML service script that verifies that the user owns this telephone number and issues a new certificate (by directly storing in the DHT). This way other friends who know his phone number instead of email address, can also reach him on p2p-sip. Making an outbound call to tel: (similar to sending outbound email) for identity verification is probably not a good idea, unless user somehow pays for the call.&lt;br /&gt;&lt;br /&gt;Identity protection (no one else can steal your identity) and verification (others can verify that you own this identity) is just one part of the P2P-SIP security. And using an email-based identity moves the problem of identity issuance and protection from P2P-SIP to your email provider.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/13033340-114078416178466361?l=p2p-sip.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/114078416178466361/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=114078416178466361' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/114078416178466361'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/114078416178466361'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2006/02/email-based-identity-in-p2p-sip.html' title='Email based identity in P2P-SIP'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' 
src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-114063948274493479</id><published>2006-02-22T15:09:00.001-05:00</published><updated>2009-11-24T03:45:15.997-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='Specification'/><category scheme='http://www.blogger.com/atom/ns#' term='API'/><category scheme='http://www.blogger.com/atom/ns#' term='DHT'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><title type='text'>Using an external DHT as a SIP location service</title><content type='html'>In December last year, I added support for an external DHT to my SIPpeer application, i.e., the SIP-using-P2P model. A &lt;a href="http://mice.cs.columbia.edu/getTechreport.php?techreportID=388"&gt;technical report&lt;/a&gt; describes the details. In particular, I used &lt;a href="http://www.opendht.org"&gt;OpenDHT&lt;/a&gt; and built signing and encryption on top of it to allow secure P2P-SIP with identity protection. The advantage of using a managed external P2P network is that we can safely assume that OpenDHT nodes are trusted, in the sense that they perform DHT routing correctly. Thus the security problem is partly solved, and identity protection can be provided by issuing a certificate to any user who has an email address. &lt;br /&gt;&lt;br /&gt;Then in January and February this year, I wrote a Tcl-based SIPpeer connector for OpenDHT and, with the help of Xiaotao Wu, added it to our SIP user agent, &lt;a href="http://www.cs.columbia.edu/IRT/sipc"&gt;sipc&lt;/a&gt;. We are trying to see whether we can make it open source, or at least freely downloadable for non-academic use as well. But first I need to fix some of the usability issues. 
I will post status updates here from time to time.</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/114063948274493479/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=114063948274493479' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/114063948274493479'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/114063948274493479'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2006/02/using-external-dht-as-sip-location.html' title='Using an external DHT as a SIP location service'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-113388974412208328</id><published>2005-12-06T12:13:00.001-05:00</published><updated>2009-11-24T03:46:06.241-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SIP'/><category scheme='http://www.blogger.com/atom/ns#' term='DHT'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><title type='text'>SIP-using-P2P vs P2P-over-SIP</title><content type='html'>I recently watched the video of the IETF 64 P2P-SIP &lt;a href="http://www.p2psip.org/ietf-64/ietf-64-dwnotes.php"&gt;ad hoc meeting&lt;/a&gt;. 
Some of the interesting points from the "SIP based vs DHT based approaches" discussion are as follows. P2P+SIP has two components: maintenance (of the P2P/DHT network) and lookup (for sending a SIP message). In the SIP-based approach, both components use SIP, whereas in the DHT-based approach, both are done using the P2P protocol directly. The DHT-based approach just requires a resolver-like library in the application, and its use of SIP is incidental. In the SIP-based approach, SIP is overloaded with the additional job of maintaining the DHT/P2P network. &lt;br /&gt;&lt;br /&gt;An intermediate approach is much better: DHT maintenance is done using the DHT/P2P protocol (instead of overloading SIP with it), and lookup is done using SIP (by sending the call request to the next-hop node in the P2P network). This gives rise to a P2P-SIP proxy model, in which the SIP proxies use the P2P algorithm to locate the user and proxy the SIP request to the next hop based on the P2P lookup. This differs slightly from the DHT-based approach in that it doesn't require the caller node to look up the user on the P2P network before sending the SIP call request. Instead, the caller uses its local structures (such as the finger table in Chord) to locate the next hop, and proxies the call request to that node. 
The intermediate approach takes advantage of the SIP-based model while preserving the simplicity of the DHT-based approach.</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/113388974412208328/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=113388974412208328' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/113388974412208328'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/113388974412208328'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2005/12/sip-using-p2p-vs-p2p-over-sip.html' title='SIP-using-P2P vs P2P-over-SIP'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-113382405714235244</id><published>2005-12-05T17:53:00.001-05:00</published><updated>2009-11-24T03:46:22.999-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='P2P'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><title type='text'>2 P2P or Not 2 P2P?</title><content type='html'>I read an &lt;a href="http://www.eecs.harvard.edu/%7Emema/publications/iptps2004.pdf"&gt;interesting paper&lt;/a&gt; on whether or not P2P is appropriate for a given application. 
Although the paper is based on file-sharing work, it seems to suggest (see the chart on page 8, and the conclusions at the end) that P2P may not be appropriate for applications such as global Internet telephony, where participants do not trust each other and the resources are of low relevance (an individual phone number or user identifier in p2p-sip is not interesting to many peers, unlike a popular file). The reasons are too much overhead due to distrust, and overhead due to non-natural p2p network evolution when the resources are not popular.&lt;br /&gt;&lt;br /&gt;However, if the trust problem can be solved in p2p-sip, I believe that p2p-sip will be useful because it can provide serverless VoIP at different scales, from small organizations to the global Internet. The current Skype model may be an appropriate application for P2P because the protocol is closed, which lets the different clients trust each other to some extent (i.e., the application is assumed not to be malicious).</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/113382405714235244/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=113382405714235244' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/113382405714235244'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/113382405714235244'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2005/12/2-p2p-or-not-2-p2p.html' title='2 P2P or Not 2 P2P?'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-13033340.post-113382233905821493</id><published>2005-12-05T17:22:00.001-05:00</published><updated>2009-11-24T03:46:44.094-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='P2P'/><category scheme='http://www.blogger.com/atom/ns#' term='Software'/><category scheme='http://www.blogger.com/atom/ns#' term='Document'/><category scheme='http://www.blogger.com/atom/ns#' term='P2P-SIP'/><title type='text'>First post: initial set of references</title><content type='html'>&lt;span style="color:navy;"&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;The recent news is that I am back from a long stay of about five months in India. I decided to start a new blog to put together my findings and readings on P2P-SIP and VoIP in general. If you would like to be added as a team member of this blog, please drop me an &lt;a href="mailto:kns10@cs.columbia.edu"&gt;email&lt;/a&gt; and I will add you!&lt;br /&gt;&lt;br /&gt;I will start with references to my work. You can see a brief overview of our P2P-SIP design in a &lt;a href="http://www1.cs.columbia.edu/%7Ekns10/publication/nyman04.pdf"&gt;short paper&lt;/a&gt;, a more detailed &lt;a href="http://www.cs.columbia.edu/%7Elibrary/TR-repository/reports/reports-2004/cucs-044-04.pdf"&gt;architecture document&lt;/a&gt;, or an extended &lt;a href="http://www1.cs.columbia.edu/%7Ekns10/publication/sip-p2p-short.pdf"&gt;overview paper&lt;/a&gt;. I also have an implementation running on Linux. 
The &lt;a href="http://www1.cs.columbia.edu/%7Ekns10/publication/sip-p2p-design.pdf"&gt;implementation report&lt;/a&gt; describes the implementation in enough detail that others can implement a similar system.&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;There are many other papers on the &lt;a href="http://www.p2psip.org/"&gt;p2p-sip&lt;/a&gt; website, including the original SoSIMPLE technical report and Internet-Drafts by Alan Johnston, Nimcat Networks, and the College of William and Mary.</content><link rel='replies' type='application/atom+xml' href='http://p2p-sip.blogspot.com/feeds/113382233905821493/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=13033340&amp;postID=113382233905821493' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/113382233905821493'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/13033340/posts/default/113382233905821493'/><link rel='alternate' type='text/html' href='http://p2p-sip.blogspot.com/2005/12/first-post-initial-set-of-references.html' title='First post: initial set of references'/><author><name>Kundan Singh</name><uri>http://www.blogger.com/profile/10408946176930869078</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='26' height='32' src='http://2.bp.blogspot.com/_j-OZz2I3T5A/SvcMYuHmmeI/AAAAAAAAAAQ/LcfPnZo1uqk/S220/kundansingh.jpg'/></author><thr:total>0</thr:total></entry></feed>
