Arkanis Development

Simple Chat: the details

This post describes the technical details of the design and implementation of the Simple Chat project I already wrote about. The idea for this kind of chat had been lingering in my mind for some time. What would an absolutely simple chat require and look like on the technical level? Well, about 20 lines of PHP and about 40 lines of JavaScript later I had an answer and a chat that doesn't need Flash, Java, a database or any other fancy stuff. In this post I will explain the basic workings behind it as well as the HTML, PHP and JavaScript code. If you're curious you can take a look at the example.

Basic idea

For small stuff (e.g. a chat for people who watch the live stream of an event) only basic chat functionality is needed: sending a message and a list with messages from everyone. I already built such a chat during the first GamesDay project but it used a SQLite database back then and had its troubles. Being on the simplicity trip lately I refined the concept and made everything work together nicely. So this is what I came up with:

So there is absolutely nothing overwhelmingly complex about this chat and every component involved contributes something to the functionality… even the webserver itself which is often forgotten in "dynamic" stuff like this. This design has several advantages:

However there are also some things to look out for:

HTML skeleton

But enough about theory, let's get started on the work. First we create a little HTML code that we will later extend with PHP and JavaScript. Since this stuff is meant to be an example for you on how to build your own little chat when the time comes, we'll only use the basic code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
    <meta http-equiv="content-type" content="text/html; charset=utf-8" />
    <title>Simple chat</title>
</head>
<body>

<h1>Simple chat</h1>

<ul id="messages">
    <li>loading…</li>
</ul>

<form action="<?= htmlentities($_SERVER['PHP_SELF'], ENT_COMPAT, 'UTF-8'); ?>" method="post">
    <p>
        <input type="text" name="content" id="content" />
    </p>
    <p>
        <label>
            Name:
            <input type="text" name="name" id="name" value="Anonymous" />
        </label>
        <button type="submit">Send</button>
    </p>
</form>

</body>
</html>

It's a basic XHTML 1.0 strict page with a list (ul#messages) that will contain all messages as well as a form to write and send new messages. There are however some details:

With this HTML skeleton we have the basics in place. The message list will be updated by the polling requests, and when the form is submitted we will kick off a POST request in the background that sends the message to the server.

Server side code

With the HTML code in place let's build the message buffer that stores the last 10 sent messages in a JSON file. Before we dive into the code, two things:

First the clients get the buffer every time it was modified, meaning that one or more messages have been added to the buffer since the last polling request. We could just append all messages of the buffer to the message list (ul#messages) but this would add old messages multiple times. So the clients need a way to know exactly which messages in the buffer are new.

This can be achieved by numbering all incoming messages (like an autoincrement key in a database). The client then only needs to remember the ID of the last message it added to the list and can ignore any messages in the buffer with an older ID. If the buffer contains no messages we simply start at an ID of 0.
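
The ID bookkeeping described above can be sketched in a few lines of plain JavaScript. This is only an illustration and not the code the chat uses later: the helper name selectNewMessages and the sample messages are made up for this example.

```javascript
// Hypothetical helper: pick out the messages the client has not shown yet.
// `buffer` is the parsed message buffer, `lastId` the ID of the last message
// already in the list (-1 if nothing has been shown so far).
function selectNewMessages(buffer, lastId) {
    var fresh = [];
    for (var i = 0; i < buffer.length; i++) {
        if (buffer[i].id > lastId) {
            fresh.push(buffer[i]);
            lastId = buffer[i].id;
        }
    }
    return { messages: fresh, lastId: lastId };
}

// Example: the client has already shown message 1, the buffer holds 0 to 3.
var buffer = [
    {id: 0, name: 'arkanis', content: 'hello world!'},
    {id: 1, name: 'tester', content: 'hello moon'},
    {id: 2, name: 'arkanis', content: 'anyone there?'},
    {id: 3, name: 'tester', content: 'yes'}
];
var result = selectNewMessages(buffer, 1);
console.log(result.messages.length, result.lastId);  // prints "2 3"
```

Only messages 2 and 3 are returned, and the client remembers 3 as the new last ID for the next polling request.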

Second, in our PHP code we need to read the old buffer to get the 9 old messages and to calculate the next ID used for the new message. We then append the new message, remove any overflowing messages and write the new buffer to the JSON file. Now this is a typical race condition where actually two things can go wrong. The first is the well known lost update, where some other thread reads the old message buffer before we could write down our new one and later overwrites our added message. However it's also possible that another thread tries to read the message buffer file while we're writing to it. In that case the read will fail and this can look like an empty file, making the thread restart at an ID of 0 and effectively blocking all clients from updating (since all messages after that get lower IDs again and are therefore ignored). I didn't check for any lost updates but I observed the second problem when a little test script put the chat under some load (about 50 simulated clients, each one posting a message randomly every 8 seconds).

If you couldn't follow every detail of that: it isn't a problem. Race conditions tend to be hard to understand. The bottom line however is that we need to lock the message buffer from the read until the write. Thanks to PHP this isn't hard but adds some code lines.
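
If it helps, here is the lost update replayed step by step in plain JavaScript. It is a simulation under simplifying assumptions: the `file` variable stands in for messages.json, and the two "requests" are just interleaved function calls, not real threads.

```javascript
// Hypothetical in-memory stand-in for messages.json.
var file = JSON.stringify([{id: 0, content: 'first'}]);

// The read-modify-write cycle split into its two halves.
function readBuffer() {
    return JSON.parse(file);
}
function appendAndWrite(messages, content) {
    var nextId = messages.length > 0 ? messages[messages.length - 1].id + 1 : 0;
    messages.push({id: nextId, content: content});
    file = JSON.stringify(messages);
}

// Without a lock: both requests read before either one writes.
var seenByA = readBuffer();
var seenByB = readBuffer();
appendAndWrite(seenByA, 'from A');
appendAndWrite(seenByB, 'from B');   // overwrites the buffer A just wrote
var unlocked = readBuffer();

// With a lock: B can only read after A has finished writing.
file = JSON.stringify([{id: 0, content: 'first'}]);
appendAndWrite(readBuffer(), 'from A');
appendAndWrite(readBuffer(), 'from B');
var locked = readBuffer();

console.log(unlocked.length, locked.length);  // prints "2 3"
```

In the unlocked run the message from A is gone and the message from B reuses ID 1. With the read and the write kept together, both messages survive and get distinct IDs.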

Now to the code itself:

<?php

$messages_buffer_file = 'messages.json';
$messages_buffer_size = 10;

if ( isset($_POST['content']) and isset($_POST['name']) )
{
    // Open and lock the message buffer
    $buffer = fopen($messages_buffer_file, 'r+b');
    flock($buffer, LOCK_EX);
    $buffer_data = stream_get_contents($buffer);
    
    // Append new message to the message buffer
    $messages = $buffer_data ? json_decode($buffer_data, true) : array();
    $next_id = (count($messages) > 0) ? $messages[count($messages) - 1]['id'] + 1 : 0;
    $messages[] = array('id' => $next_id, 'time' => time(), 'name' => $_POST['name'], 'content' => $_POST['content']);
    
    // Remove old messages
    if (count($messages) > $messages_buffer_size)
        $messages = array_slice($messages, count($messages) - $messages_buffer_size);
    
    // Rewrite and unlock the message file
    ftruncate($buffer, 0);
    rewind($buffer);
    fwrite($buffer, json_encode($messages));
    flock($buffer, LOCK_UN);
    fclose($buffer);
    
    // Append message to log file or omit it if you don't need it
    file_put_contents('chatlog.txt', date('Y-m-d H:i:s') . "\t" . strtr($_POST['name'], "\t", ' ') . "\t" . strtr($_POST['content'], "\t", ' ') . "\n", FILE_APPEND);
    
    exit();
}

?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
…

First we check if we received data from our form (content and name fields were sent via POST). If so we got a new message that we will append to our message buffer. In exchange we kick out the oldest message if the buffer is already at its maximum size.

The code only reacts if we really got a new message and serves the HTML content after it on normal GET requests. When a new message comes in we open, lock and read the current message buffer, append the new message, cut off any old messages to maintain the buffer size and overwrite the file with the new buffer. Now on to the interesting parts:

That code gives us a nice and small messages.json file looking like this:

[
    {"id": 0, "time": 1282167333, "name": "arkanis", "content": "hello world!"},
    {"id": 1, "time": 1282167335, "name": "tester", "content": "hello moon"},
    …
]

Spaces and line breaks were inserted for clarity. Usually everything will be in one line with no wasted spaces. The names of the keys are important because we will use them to access the message data with JavaScript on the client side.

A word on security

Also note that no real input validation is done on the server. Unfortunately I have to disappoint any paranoid reader: we will not do some overwhelmingly complex filtering but will just make sure that all incoming data is properly encoded when it's going out. json_encode is one of those steps and no data can break out of it. In the chat log we use tabs as field separators and therefore we replace any tabs in the name or content with spaces. We will do some further escaping on the client later to prevent XSS attacks.

Since we use proper escaping you should also disable PHP's Magic Quotes if your PHP version still has them (they were removed in PHP 5.4). They will only mess up the original data.
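
To make the tab handling of the chat log concrete, here is the same idea as a small JavaScript sketch. The function name formatLogLine and the sample values are made up for illustration; the PHP code above does the equivalent with strtr.

```javascript
// Hypothetical helper mirroring the chat log line: timestamp, name and
// content separated by tabs, with tabs inside the fields turned into spaces.
function formatLogLine(timestamp, name, content) {
    var clean = function(text) { return text.replace(/\t/g, ' '); };
    return timestamp + '\t' + clean(name) + '\t' + clean(content) + '\n';
}

var line = formatLogLine('2010-08-18 21:35:33', 'evil\tname', 'hello\tworld');
console.log(line.split('\t').length);  // prints "3"
```

No matter what a user types, a log line always splits into exactly three fields.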

The magic in between: client side code

We will use the jQuery framework to make the JavaScript code more fun to write. Still, this will be a somewhat bigger bunch of lines than the PHP stuff. To not scare you away with one large code block I'll divide it into several small blocks, one for each purpose.

First the usual stuff when using jQuery: include the jQuery framework itself and then do something as soon as the DOM tree is ready (using $(document).ready()). Our first action as the new ruler of the client is to remove the "loading" list entry:

<head>
    <meta http-equiv="content-type" content="text/html; charset=utf-8" />
    <title>Simple chat</title>
    <script type="text/javascript" src="jquery.js"></script>
    <script type="text/javascript">
        // <![CDATA[
        $(document).ready(function(){
            $('ul#messages > li').remove();

            // code to send a message goes here…

            // placeholder for polling code…
        });
        // ]]>
    </script>
</head>
<body>

Sending a message… or: say something

For this occasion we hijack the submit event of the message form to kick off a POST request in the background. We then insert a "pending" message into the message list to let the user know that we actually did something. It is "pending" because we sent the message but have not yet received new messages from the server. As soon as the next bunch of messages comes in we will remove this pending message.

$('form').submit(function(){
    var form = $(this);
    var name =  form.find("input[name='name']").val();
    var content =  form.find("input[name='content']").val();
    if (name == '' || content == '')
        return false;

    $.post(form.attr('action'), {'name': name, 'content': content}, function(data, status){
        $('<li class="pending" />').text(content).prepend($('<small />').text(name)).appendTo('ul#messages');
        $('ul#messages').scrollTop( $('ul#messages').get(0).scrollHeight );
        form.find("input[name='content']").val('').focus();
    });
    return false;
});

If name or content is blank we just stop. It might be a good idea to hint to the user that one or both of these fields are missing, but since we have a default value for the name the user would have to deliberately clear the name field. In that case the user can expect it not to work, just as when trying to send an empty message.

After kicking off the POST request we then stop the event processing with return false;. This will keep the page from being reloaded by the browser.

As soon as we know the POST request succeeded (our callback runs) we insert the pending message into the list. Actually we first build an li element with the class of pending and set the message content as its text. Since we do this with the text() method it's clear this string can not contain other elements and therefore all HTML stuff is escaped automatically (jQuery actually inserts a text node into the DOM tree and the browsers do the escaping themselves). Into this li element we insert a small element with some additional information such as the user's name, which is also inserted as text. Now we have something like this:

<li class="pending">
    <small>Anonymous</small>
    An example message text
</li>

At the end of the line we use appendTo to… well, append the built li element to the message list.

After this we just scroll the list down to show the new message and clear the message text field of the form so the user can start writing the next message.
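
To see why inserting messages as text keeps markup harmless, it can help to spell out what the escaping amounts to. The sketch below is only an illustration: the browser handles this internally for text nodes, and the escapeHtml helper is a hypothetical stand-in that is not part of the chat.

```javascript
// Hypothetical helper showing the effect of inserting a string as text:
// the characters that could start markup are turned into entities.
function escapeHtml(text) {
    return text
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

console.log(escapeHtml('<script>alert("hi")</script>'));
// prints: &lt;script&gt;alert("hi")&lt;/script&gt;
```

A message like that shows up in the list exactly as typed instead of being run as a script.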

Receiving messages… or: hear something

As explained above we will ask the server every 2 seconds for the messages.json file and insert any new messages into the message list. To do this we first create a function that does our GET request and make sure it's called every two seconds:

var poll_for_new_messages = function(){
    $.ajax({url: 'messages.json', dataType: 'json', ifModified: true, timeout: 2000, success: function(messages, status){
        if (!messages)
            return;
        
        $('ul#messages > li.pending').remove();
        var last_message_id = $('ul#messages').data('last_message_id');
        if (last_message_id == null)
            last_message_id = -1;
        
        for(var i = 0; i < messages.length; i++)
        {
            var msg = messages[i];
            if (msg.id > last_message_id)
            {
                var date = new Date(msg.time * 1000);
                $('<li/>').text(msg.content).
                    prepend( $('<small />').text(date.getHours() + ':' + date.getMinutes() + ':' + date.getSeconds() + ' ' + msg.name) ).
                    appendTo('ul#messages');
                $('ul#messages').data('last_message_id', msg.id);
            }
        }
        
        $('ul#messages > li').slice(0, -50).remove();
        $('ul#messages').scrollTop( $('ul#messages').get(0).scrollHeight );
    }});
};

poll_for_new_messages();
setInterval(poll_for_new_messages, 2000);

There isn't much about the GET request itself. The ifModified: true parameter makes sure that we only get message data if the message data has actually been modified. We also set a timeout of 2 seconds because after that time we start a new GET request anyway.

The message handler itself is aborted if our incoming data (messages) is undefined. This happens when the data was not modified. In case we got new stuff the action begins:

And that's it for the basic functionality. Throw it at a PHP enabled webserver and create the messages.json file (don't forget: the webserver needs read and write permission). Your own basic chat should now be running along smoothly.
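
One small cosmetic detail of the polling code: the timestamp is put together from getHours(), getMinutes() and getSeconds() without padding, so three seconds after five past nine shows up as 9:5:3. If that bothers you, a zero padded variant could look like the following sketch. The helper names pad and formatTime are made up for this example.

```javascript
// Hypothetical helpers: format a Date as HH:MM:SS with leading zeros.
function pad(number) {
    return (number < 10 ? '0' : '') + number;
}
function formatTime(date) {
    return pad(date.getHours()) + ':' + pad(date.getMinutes()) + ':' + pad(date.getSeconds());
}

console.log(formatTime(new Date(2010, 7, 18, 9, 5, 3)));  // prints "09:05:03"
```

In the polling code you would then call formatTime(date) instead of concatenating the three parts by hand.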

Some CSS for the eye

While the chat now already works perfectly it might look a bit strange. Because every website is styled differently I suggest you leave the styling of the chat to your own creativity. However if you just want a quick starting point take these lines of CSS:

<style type="text/css">
    ul#messages { overflow: auto; height: 15em; margin: 1em 0; padding: 0 3px; list-style: none; border: 1px solid gray; }
    ul#messages li { margin: 0.35em 0; padding: 0; }
    ul#messages li small { display: block; font-size: 0.59em; color: gray; }
    ul#messages li.pending { color: #aaa; }

    form { font-size: 1em; margin: 1em 0; padding: 0; }
    form p { position: relative; margin: 0.5em 0; padding: 0; }
    form p input { font-size: 1em; }
    form p input#name { width: 10em; }
    form p button { position: absolute; top: 0; right: -0.5em; }

    ul#messages, form p, input#content { width: 40em; }
</style>

These CSS rules are a safe start, even for poor IE 6 users. The most important part is the overflow: auto property paired with a fixed height. This transforms the message list into a box with its own scrollbars. Another little trick is to position the submit button on the right side of its enclosing paragraph (using position: relative and position: absolute). There are many other ways to do this but when confronted with IE 6 it's one of the few ways without many "strange" side effects.

With this we're done writing any code and you should have something very similar to the example chat page. My congratulations if you really read this far. :)

About performance

While the main aspect of this chat is its simplicity you don't really understand a technology if you don't know when it breaks. To explore the locking I created a small Ruby script that puts some load on the server: test.rb

The number of clients and the URLs are hard coded so you have to modify the script for your own setup if you want to do some testing. However it's just a quick 5 minute script and is neither well programmed nor scalable. It doesn't distribute the requests very well over time; instead a big bunch of requests floods the webserver every two seconds. For examining the locking this was quite useful though, since this behavior stresses the locking quite heavily.

Performance-wise I couldn't really test more than 150 clients. At that load the webserver (Apache with PHP) needed negligible CPU and IO on my development machine (an old Intel Core 2 E6300) but in the browser the time for one polling request went up to 200ms. However the test script was eating up all other CPU time.

With even more clients the Ruby script hit an expected "threading barrier" on my system. Even with 300 clients I didn't see any messages with client IDs above about 150. I suppose the other 150 threads just starved and never got to run. I don't really know what was going on there but some experiments around keep-alive requests and a better test program might help. Also during that test a chat in one browser (Firefox 3.6 with Firebug) stopped working because all polling requests timed out.

The bottom line is: I don't really know the upper limit but the chat can take more load than such a simple thing will ever get. 150 users in one chat room is nothing but a total mess anyway. If you ever want to use a chat for something big please go for the real stuff like IRC.

Further ideas

Depending on what you need there are many ways you can modify or extend the chat. To mention just a few:

However the goal of this project was to see what is not needed. It's all too easy to build something that explodes in complexity so you might want to think about a feature twice before you really add it. ;)

2 comments for this post

#1 by
Christophe

Hello, I'm really newby in php/javascript. I tryed to install simple chat.When I click on "send", the page become white, and I must reload it. The file "messages.json" seems OK, messages and names are recorded, but nothing is showing on simple chat… Can you help me please? Friendly, Christophe

#2 by
Stephan

Hi Christophe,

this happens if for some reason JavaScript is broken or disabled. Then the code that usually handles the form submission never gets executed. The browser uses the default behaviour which is more or less to reload the page.

Make sure that your page doesn't contain JavaScript errors and that JavaScript is enabled. Best take a look into the console of the developer tools. All errors should be listed there. The error messages usually provide good hints to fix these errors.
