#.think.in
learn.create.enjoy

The Mouse is Dead, Long Live the Keyboard

June 16, 2010 23:40 by tarn

First up, a disclaimer: this is how I like to roll. I enjoy it. I'm not saying you shouldn't use a mouse or that I'll never use one again.

I spend a lot of time programming on my netbook development environment because I enjoy it. Without all the visual tooling I feel I can focus on working with the machine, learning its many tongues and tricks. I've run Ubuntu on it for about a year and have learned heaps about Linux. There have been many languages, tools and skills I would not otherwise have been able to learn.

One thing I don't like about it is using the crappy mouse pad. It has long been a pain point, even doing trivial things. Recently I saw a tweet by @TheColonial listing some mostly unfamiliar names and adding #themouseisdead. This was exciting, I had recently adopted vim and was loving it.

Last weekend I was a bit under the weather and not thinking clearly enough to write code, so I used the time to set up and learn a few new tools. I put together a powerful, mouse-less development environment on my tiny netbook, which I think is pretty fun to use.

xmonad

xmonad is a great window manager (~1000 lines of Haskell) which seems to plug into most X systems. There are other ways of managing a windowed environment that I never knew existed! It is all keyboard driven: it works out non-overlapping window layouts, and you can move between multiple desktops and move windows between them.

Here are some typical tasks in a desktop with a few windows using xmonad:

  • Switch between window layouts, generated by Haskell algorithms (mod-space).
  • Select between windows on screen (mod-tab).
  • Throw the selected window to another desktop (shift-mod-2).
  • Change to the desktop (mod-2).

On Ubuntu it installed with:

sudo apt-get install xmonad

Which installed its Haskell dependencies and configured itself. It also added an xmonad session option on the login screen; I wanted to use this and keep the normal session intact.

When you start an xmonad session you are presented with a completely empty screen except for a background picture. You can start a terminal session with some keys (mod-shift-enter), but a menu would be nice.

dmenu

dmenu is a generic dynamic menu for X systems, it allows menu items to be selected efficiently with a keyboard.

It installed with:

sudo apt-get install dwm-tools

This adds another session to the login screen. Now anything can be opened from the xmonad session by bringing up dmenu (mod-p) and typing the first few letters of the desired application.

Vim

I'm surprised and a little annoyed I didn't start using it sooner, really; it is a very powerful text editor. I still fumble through it but feel I'm learning to increase my productivity and reduce my frustration working with code.

sudo apt-get install vim

Now vim is available from the terminal. The built-in tutorial (run vimtutor) is the best place to start, then there is a large ecosystem of plugins.

Vimperator

Vimperator is a Firefox extension that provides keyboard commands using vim idioms. It provides deep control of Firefox; a basic browsing scenario might work like this:

You can open a page in a new tab

:tabopen theage.com.au

Then to open a link to a story in a new tab

shift-f

This adds unique two digit numbers to all links. Now any link can be opened by typing the digits or the link text.

To go to the new tab:

gt

Once read the tab can be deleted

d

TTYtter

TTYtter is a command-line Twitter client written in Perl. Anything typed that doesn't start with "/" is a tweet (the -verify and -slowpost options can help with the potential problems that might cause). The forward slash is used to invoke commands like "whois", "replies", "reply", "dm", "follow", "thread" etc. It's built on curl, so we need that first.

sudo apt-get install curl

Then download the perl script from the website, make it executable and move it to the /usr/bin folder.

I like everything about this app down to the ascii art in the menus. The dude has some pretty cool stuff including a network of gopher servers.

Vimium

Vimium is not as powerful as Vimperator, but provides some vim-like keyboard controls for Chrome. Easily installed as a chrome add-in.

Rock'n'roll time?

I can now manage windows over multiple virtual desktops, use a fast, effective menu to open applications, have extensive control of Firefox and, most importantly, tweet about it with TTYtter. I am excited and looking forward to working with some code.

My set-up still isn't right yet though; I need a status bar and a decent start script (the plain xmonad session is just a blank canvas; an internet connection, battery power monitor and a clock would be useful). I'm going to try dzen for this.

And my keyboard skills on the special keys are way worse than I'd like them to be. The best way to fix that is to keep my hands on the keyboard, right?

Finally thanks to all the people responsible for the OSS platform, languages and many tools. They are a pleasure to use.



Revisiting dragging and inertia with RxJs

June 7, 2010 19:54 by tarn

I've been enjoying spending some time playing with the Reactive Extensions for JavaScript and wanted to write about it before I lost my netbook in a bar. Sadly this blog makes it difficult to present the content and examples together, so before fixing that, I put it here.

 



Map-Reduce on Mongo

May 12, 2010 19:00 by tarn

I'm doing a presentation on non-relational databases at DDD Melbourne this weekend, where I am going to demonstrate a map-reduce example with MongoDB and server-side JavaScript. I've been interested in both independently recently, and it's been fun getting them to work together, with some JavaScript TDD to boot.

I needed a good example to demonstrate map-reduce and decided that finding word occurrences across a series of documents was a simple enough scenario, and one well suited to being solved by a map-reduce query.

Below is an example of how we might solve this in plain C#:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

class wordCounts {

    static void Main(string[] args) {

        // Setup some data
        List<string> lines = new List<string>()
            {
              "Peter Piper picked a peck of pickled peppers",
              "A peck of pickled peppers Peter Piper picked",
              "If Peter Piper picked a peck of pickled peppers",
              "Where's the peck of pickled peppers Peter Piper picked?"
            };

        // select all words, group, count
        var wordCounts = lines.SelectMany(m => m.Split())
                              .GroupBy(m => m.ToLower())
                              .Select(m => new KeyValuePair<string, int>(m.Key, m.Count()));

        // Print out the results
        foreach (var wordCount in wordCounts)
        {
            Console.WriteLine(string.Format("{0} {1}", wordCount.Value, wordCount.Key));
        }
    }
}

This prints each distinct word across all the lines and the number of times it occurs. The collection of strings is analogous to a collection of documents in MongoDB for this example.

The SelectMany flattens the lists of words from each line into a single list of words, and the GroupBy provides keys for each word; this is very similar to what the map function in the map-reduce query does.

The Select function is similar to the reduce function, but as we will see some additional considerations need to be made to allow it to be distributed.

I saw a good diagram Ayende published on his blog, but I didn't understand why he had multiple instances of the same document being mapped.

I created my own low key diagram to help demonstrate how a functional map-reduce could be distributed. The diagram shows the initial items can be split in half and reduced completely independently. This is interesting as it means our query can be distributed, but it also means we have to handle reducing a little differently.

It's also worth noting that this example shows a balanced tree, but it could be unbalanced and even introduce some redundancy.
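As a rough sketch of that split-and-combine idea, here are the four Peter Piper lines mapped and reduced in two independent halves and then reduced again. It is written in Python 3 rather than the JavaScript that Mongo runs, the helper names (word_map, word_reduce, map_reduce, merge) are made up for illustration, and it lowercases words the way the C# version does. It only illustrates the shape of the computation, not how MongoDB actually schedules the work.

```python
import re
from collections import defaultdict


def word_map(text):
    """Yield a (word, {"count": 1}) pair per word, much like calling emit()."""
    for word in re.findall(r"\w+", text.lower()):
        yield word, {"count": 1}


def word_reduce(key, values):
    """Sum partial counts; the result has the same shape as each input value."""
    return {"count": sum(v["count"] for v in values)}


def map_reduce(docs):
    """Map every document, group the emitted values by key, reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in word_map(doc):
            groups[key].append(value)
    return {key: word_reduce(key, values) for key, values in groups.items()}


def merge(left, right):
    """Combine two partial results by running reduce over their outputs."""
    merged = {}
    for key in set(left) | set(right):
        partials = [part[key] for part in (left, right) if key in part]
        merged[key] = word_reduce(key, partials)
    return merged


lines = [
    "Peter Piper picked a peck of pickled peppers",
    "A peck of pickled peppers Peter Piper picked",
    "If Peter Piper picked a peck of pickled peppers",
    "Where's the peck of pickled peppers Peter Piper picked?",
]

whole = map_reduce(lines)                                       # one pass
split = merge(map_reduce(lines[:2]), map_reduce(lines[2:]))     # two halves
print(whole == split)
```

The two halves never see each other's data, yet the merged result matches the single pass, because reduce accepts its own output as input.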

MongoDB allows clients to send JavaScript map and reduce functions that will get eval'd and run on the server. Here is the map function.

function wordMap() {

    // try find words in document text
    var words = this.text.match(/\w+/g);

    if (words === null) {
        return;
    }

    // loop every word in the document
    for (var i = 0; i < words.length; i++) {
        // emit every word, with count of one
        emit(words[i], { count : 1 });
    }
}

The much-misunderstood JavaScript "this" will be the context from which the function is called. Mongo will call the function for each document in the collection we are querying, and we can call it from a test context. Unlike SelectMany, the map function doesn't return a list; instead it calls an emit function which it expects to be defined.

We can write unit tests for this function by calling the function from a test mock context, calling a mock emit function (using Javascript as our mocking framework, wow).

eval(loadFile("src/js/wordMap.js"));


var emit;
var results;
var context;

testCases(test,

    function setUp() {
        emit = function (key, value) { 
            results.push({ key : key, value : value });
        };
        context = { text : "", map : wordMap };
        results = []; 
    },

    function empty_string_emits_nothing() {
        context.text = "";
        context.map();
        assert.that(results.length, eq(0));
    },

    function single_word_emits_single_word() {
        context.text = "findme";
        context.map();
        assert.that(results.length, eq(1));
        assert.that(results[0].key, eq("findme"));
        assert.that(results[0].value.count, eq(1));
    },

    function two_different_words_emits_twice() {
        context.text = "for bar";
        context.map();
        assert.that(results.length, eq(2));
    },

    function two_same_words_emits_twice() {
        context.text = "test test";
        context.map();
        assert.that(results.length, eq(2));
    },

    function tearDown() {
    }
);


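The reduce function must reduce a list of a chosen type to a single value of that same type. It must also give the same answer however the mapped items are grouped and ordered (strictly this is associativity and commutativity, though the tests below call it "transitive"), because the server may feed the output of one reduce call back in as an input to another.

function wordReduce(key, values) {
    var total = 0;
    for (var i = 0; i < values.length; i++) {
        total += values[i].count;
    }
    return { count : total };
}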


Similarly we can test this method does exactly what we expect it to.

eval(loadFile("src/js/wordReduce.js"));

testCases(test,

    function reduce_one_items_returns_count_of_one() {
        var result = wordReduce("test", [{ count : 1 }]);
        assert.that(result.count, eq(1));
    },

    function reduce_multiple_items_returns_item_count() {
        var result = wordReduce("test", [{ count : 1 }, { count : 1 }, { count : 1 }]);
        assert.that(result.count, eq(3));
    },

    function reduce_sums_counts() {
        var result = wordReduce("test", [{ count : 2 }, { count : 3 }]);
        assert.that(result.count, eq(5));
    },

    function reduce_is_transitive() {
        // feed the output of one reduce back in as an input to another
        var result = wordReduce("test", [{ count : 1 }].concat(
                        wordReduce("test", [{ count : 1 }, { count : 1 }])
                     ));
        assert.that(result.count, eq(3));
    }
);
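The regrouping property that the last test checks can also be seen by reducing in chunks and then reducing the partial results. The sketch below is Python 3 with hypothetical helper names; naive_reduce is an invented counterexample that counts list items instead of summing counts, and it shows why reduce has to sum the count field.

```python
def word_reduce(key, values):
    """Sum the partial counts, so the output can be reduced again safely."""
    return {"count": sum(v["count"] for v in values)}


def naive_reduce(key, values):
    """Count list items; only correct if reduce runs exactly once per key."""
    return {"count": len(values)}


def reduce_in_chunks(reduce_fn, key, values, size):
    """Reduce fixed-size chunks first, then reduce the partial results."""
    partials = [reduce_fn(key, values[i:i + size])
                for i in range(0, len(values), size)]
    return reduce_fn(key, partials)


emitted = [{"count": 1} for _ in range(5)]   # five emits for the same word

direct = word_reduce("peck", emitted)
chunked = reduce_in_chunks(word_reduce, "peck", emitted, 2)
naive_chunked = reduce_in_chunks(naive_reduce, "peck", emitted, 2)

print(direct, chunked, naive_chunked)
```

Summing survives the regrouping, while the item-counting version ends up counting the three partial results rather than the five original emits.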


I'm using Rhino to run the JavaScript, so I used RhinoUnit as a test runner: it also uses the JVM and runs as an Ant scriptdef task, and the setup was pretty painless. Here are the relevant Ant script sections:

<scriptdef name="rhinounit"
           src="lib/rhinoUnitAnt.js"
           language="javascript">
    <attribute name="options"/>
    <attribute name="ignoredglobalvars"/>
    <attribute name="haltOnFirstFailure"/>
    <attribute name="rhinoUnitUtilPath"/>
    <element name="fileset" type="fileset"/>
</scriptdef>

<target name="javascript-tests">
    <rhinounit options="{verbose:true, stackTrace:true}"
               haltOnFirstFailure="false"
               rhinoUnitUtilPath="lib/rhinoUnitUtil.js">
        <fileset dir="test">
            <include name="*.js"/>
        </fileset>
    </rhinounit>
</target>


Here is the word count example recreated in Mongo, using a Python client and passing the map/reduce functions to the server.

from pymongo import Connection
from pymongo.code import Code

# open connection and connect to 'ddd' database
connection = Connection()
db = connection.ddd

# remove any existing data
db.drop_collection("messages")

# insert some data, one document per line of text
lines = open("data/peter_piper.txt").readlines()

for line in lines:
    db.messages.insert( { "text" : line } )

# load map and reduce functions (named so they don't shadow Python's builtins)
map_fn = Code(open("src/js/wordMap.js","r").read())
reduce_fn = Code(open("src/js/wordReduce.js","r").read())

# run the map-reduce query
result = db.messages.map_reduce(map_fn, reduce_fn)

# print the results
for doc in result.find():
    print doc["value"]["count"],doc["_id"]


And it worked! I'd like to run the query on a larger data set, but there isn't much point on this tiny low-spec'd netbook.


DevEvening NoSql/MongoDB Presentation

April 7, 2010 21:12 by tarn

Here are the slides and demo I'll show in my NoSQL presentation for tomorrow's DevEvenings Melbourne ORM Smackdown. I hope to take the time to write a more considered post of my findings and opinions, as I've found it very interesting.

Here is a link to the slides.

First I'll use some Python and the PyMongo module to connect to MongoDB, list databases, insert documents and get them out again.

Python 2.6.4 (r264:75706, Dec  7 2009, 18:45:15) 
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from pymongo import Connection
>>> connection = Connection()
>>> connection.database_names()
[u'files', u'working', u'demo', u'downloads', u'posts', u'local', u'admin']
>>> db = connection.demo
>>> from datetime import datetime
>>> db.messages.insert( { 'author' : 'tarn', 'date': datetime.now(), 'message' : 'Hello Mongo' } )
ObjectId('4bbc4beec73d721445000003')
>>> db.messages.find_one()
{u'date': datetime.datetime(2010, 4, 7, 19, 10, 6, 355000), u'message': u'Hello Mongo', 
u'_id': ObjectId('4bbc4beec73d721445000003'), u'author': u'tarn'}

Then have a look at some content I have in the database

>>> db = connection.working
>>> db.posts.count()
106
>>> for post in db.posts.find()[:5]:
...     print post["date"],post["title"],"by",post["author"]
... 
2010-01-25 18:06:00 Python Silverlight/Moonlight 2 Xapping by tarn
2010-03-12 13:53:00 Devevenings Presentation - IOC/Unit Testing/Mocking in ASP.NET MVC by tarn
2010-02-17 19:25:00 Revisiting Modal Binding an Interface, now with DictionaryAdapterFactory by tarn
2009-12-02 18:35:00 Creating Silverlight apps in the browser by tarn
2009-10-02 23:08:00 #.think.in infoDose #43 (11th September - 22nd September) by brodie

Next is file storage using the GridFS class from the gridfs module. I'll show some files and then write a file out to the file system.

>>> from gridfs import GridFS
>>> fs = GridFS(connection.files)
>>> len(fs.list())
116
>>> for file in fs.list()[:5]:
...     print file
... 
post/debugging-ironpython-with-my-excalibur/image.png
post/debugging-ironpython-with-my-excalibur/image_thumb.png
post/devevenings-presentation---iocunit-testingmocking-in-asp.net-mvc/20102f32fdevevening_presentation.pptx
post/devevenings-presentation---iocunit-testingmocking-in-asp.net-mvc/20102f32fguestbook.zip
post/think.in-infodose-40-5th-august---16th-august/image.png
>>>
>>> with open('image.png','wb') as out_file:
...     with fs.open('post/debugging-ironpython-with-my-excalibur/image.png') as in_file:
...         out_file.write(in_file.read())
...
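When copying binary content such as a PNG, the output file should be opened in binary mode so the bytes are written untouched (text mode can translate newlines on Windows, and Python 3 refuses to write bytes to a text-mode file at all). Here is a small Python 3 sketch of the same copy, using an in-memory io.BytesIO as a stand-in for the GridFS file; the byte content and the temporary path are made up for the example.

```python
import io
import os
import shutil
import tempfile

# Stand-in for a file object read from a store such as GridFS (made-up bytes).
png_header = b"\x89PNG\r\n\x1a\n"
in_file = io.BytesIO(png_header + b"\x00" * 16)

out_path = os.path.join(tempfile.mkdtemp(), "image.png")

with open(out_path, "wb") as out_file:       # "wb": write the bytes untouched
    shutil.copyfileobj(in_file, out_file)    # copies in chunks, not all at once

# Read the file back to confirm nothing was altered on the way out.
with open(out_path, "rb") as check:
    copied = check.read()

print(len(copied))
```

shutil.copyfileobj also avoids holding the whole file in memory, which matters more for large attachments than for a screenshot.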

Now to some C# (Mono) and a controller class for a basic web application to view the data. It serves files found in the database, but only sends back the correct MIME type for "image/png". Lazy.

public class HomeController : Controller
{
    BlogRepository _blogRepository;

    public HomeController()
    {
        _blogRepository = new BlogRepository(); 
    }

    public ActionResult Index ()
    {
        ViewData["posts"] = _blogRepository.GetPosts();
        return View ();
    }

    public ActionResult Entry(string id)
    {
        ViewData["post"] = _blogRepository.GetById(id);
        return View ();
    }

    public ActionResult Resource(string slug, string fileName)
    {
        return new FileStreamResult(_blogRepository.GetFile( "post/" + slug + "/" + fileName ), "image/png");
    }

}
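A less lazy Resource action would pick the content type from the file name. The idea is sketched below in Python 3 with the standard mimetypes module, not in the C# the controller is written in; content_type_for is a hypothetical helper and the file names are only examples.

```python
import mimetypes


def content_type_for(file_name, default="application/octet-stream"):
    """Guess a MIME type from the extension, falling back to generic binary."""
    guessed, _encoding = mimetypes.guess_type(file_name)
    return guessed or default


png_type = content_type_for("image.png")
unknown_type = content_type_for("notes.unknownext")

print(png_type, unknown_type)
```

Falling back to a generic binary type means an unrecognised attachment is offered as a download instead of being mislabelled as an image.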

This is a very basic repository that provides the data for the demo application. I did only just enough with the C# provider to get it working, and tried to make sure I disconnect my connections.

public class BlogRepository
{
    Mongo _mongo;

    public BlogRepository()
    {
        string connstr = ConfigurationManager.AppSettings["connectionString"];
        _mongo = new Mongo(connstr);
    }

    public Stream GetFile(string name)
    {
        try
        {
            _mongo.Connect();
            var db = _mongo["files"];
            var fs = new GridFile(db);
            Stream data = fs.Open(name, FileMode.Open, FileAccess.Read);
            Stream output = new MemoryStream();
            CopyStream(data,output);
            output.Seek(0,SeekOrigin.Begin);
            return output;
        }
        finally
        {
            _mongo.Disconnect();    
        }
    }

    public List<Document> GetPosts()
    {
        try
        {
            _mongo.Connect(); 
            var db = _mongo["working"];
            var posts = db["posts"];
            using(ICursor all = posts.Find(new Document())){
                return all.Documents.ToList();
            }
        }
        finally
        {
            _mongo.Disconnect();
        }
    }

    public Document GetById(string id)
    {   
        try
        {
            _mongo.Connect();
            var db = _mongo["working"];
            var posts = db["posts"];
            Document doc = posts.FindOne( new Document() {{ "_id" , new Oid(id) }} );
            return doc;
        }
        finally
        {
            _mongo.Disconnect();    
        }
    }

    public static void CopyStream(Stream input, Stream output)
    {
        byte[] buffer = new byte[32768];
        while (true)
        {
            int read = input.Read (buffer, 0, buffer.Length);
            if (read <= 0)
                return;
            output.Write (buffer, 0, read);
        }
    }
}

So that's where I got for my demo for the DevEvenings ORM Smackdown. No doubt I will continue looking into MongoDB and other object/document databases.


Scraping this blog

March 18, 2010 23:31 by tarn

I have created a monster and this post is about killing it off by scraping the contents of this blog into structured Python objects. Sometime later I will convert the HTML content to markdown and download the images and other resources locally.

I want to put the contents into a ZODB object database to get a feel for working with an object database. A greater goal is to migrate the content to a new blog engine. I don't want to go into why I felt I needed to scrape it or why I want to migrate to another blog engine, as it's depressing.

Moving on, I wanted to put the content into these classes

class Post():
    title = ''
    content = ''
    date = ''
    tags = []
    comments = []


class Comment():
    content = ''
    author = ''
    date = ''
    website = ''
The scraping code is not elegant but was quite fun to write as I could write it all from an interactive console session. I found BeautifulSoup was fantastic in making HTML into something that was easy to work with, although I would have liked to have used jQuery/CSS style selectors.

from BeautifulSoup import BeautifulSoup
from datetime import datetime
import urllib2
import re

def ParseComment(soup):
    comment = Comment()
    # the first element inside <p class="author"> holds the name (and optional link)
    author = soup.find('p',{"class":"author"}).first()
    comment.author = author.string.strip()
    content = soup.find('p',{"class":"content"})
    if content:
        comment.content = content.prettify()
    if author.has_key('href'):
        comment.website = author['href']
    # pull the "dd/mm/yyyy hh:mm" timestamp out of the date paragraph
    r = re.compile(r'\d+/\d+/\d+ \d+:\d+')
    date = r.findall(soup.find('p',{"class":"date"}).renderContents())[0]
    comment.date = datetime.strptime(date,'%d/%m/%Y %H:%M')
    return comment


def ParsePost(postSoup):
    post = Post()    
    post.title =  postSoup.find('a',{"class":re.compile('posthead.*')}).string
    print post.title
    post.content = postSoup.find('div', {"class":"entry"})
    date = postSoup.find('div',{"class":"descr"}).contents[0][:-4]
    post.date = datetime.strptime(date,'%B %d, %Y %H:%M')
    post.author = postSoup.find('div',{"class":"descr"}).first().string
    post.tags = map(lambda x: x.string, postSoup('a',{"rel":"tag"}))
    comments = postSoup.find('div',{"id":"commentlist"})('div')
    post.comments = [ParseComment(commentSoup) for commentSoup in comments]
    return post


def DownloadPost(url):
    postHtml = urllib2.urlopen('http://blog.sharpthinking.com.au/' + url).read()
    postSoup = BeautifulSoup(postHtml)
    return ParsePost(postSoup)


def GetPosts():
    page = urllib2.urlopen("http://blog.sharpthinking.com.au/archive.aspx")
    soup = BeautifulSoup(page)
    postUrls = map(lambda x: x['href'], soup('a', href=re.compile('/post/.*')))
    return [DownloadPost(url) for url in postUrls[:-10]]
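The date handling in ParseComment and ParsePost can be tried on its own without any HTML. This is a Python 3 sketch using only the standard library; the sample strings are invented to match the formats the code expects, and parse_comment_date is a hypothetical helper.

```python
import re
from datetime import datetime

# Matches timestamps such as "18/03/2010 23:31" embedded in other text.
COMMENT_DATE_RE = re.compile(r"\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}")


def parse_comment_date(text):
    """Return the first dd/mm/yyyy hh:mm timestamp in text, or None if absent."""
    match = COMMENT_DATE_RE.search(text)
    if match is None:
        return None
    return datetime.strptime(match.group(0), "%d/%m/%Y %H:%M")


comment_date = parse_comment_date("Posted on 18/03/2010 23:31 #")
missing_date = parse_comment_date("no timestamp here")

# Post headers look like "March 18, 2010 23:31 by <author>"; dropping the
# trailing " by " (four characters) leaves text that strptime can read.
descr = "March 18, 2010 23:31 by "
post_date = datetime.strptime(descr[:-4], "%B %d, %Y %H:%M")

print(comment_date, missing_date, post_date)
```

Returning None for a missing timestamp avoids the IndexError that indexing an empty findall result would raise.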



I'm sure there is a better way, but this was better than any way I've used previously. Anyway, I've done a lot of work untangling the mess I created.

>>> posts = GetPosts()
>>> for post in posts[:5]:
...     print post.date, post.title
...
2010-03-17 22:16:00 OMG. It's a JavaScript Rhino
2010-03-12 13:53:00 Devevenings Presentation - IOC/Unit Testing/Mocking in ASP.NET MVC
2010-02-20 17:18:00 Revisiting Pygments in the browser with Silverlight, now with BackgroundWorker
2010-02-17 19:25:00 Revisiting Modal Binding an Interface, now with DictionaryAdapterFactory
2010-02-16 20:34:00 Modal Binding an Interface with DynamicProxy


I wanted to put the contents into the object database tonight, but I have pickled it to be revisited later.


OMG. It's a JavaScript Rhino

March 17, 2010 22:16 by tarn

JavaScript is a slightly flawed language but it's got elegant parts too. All languages do to some degree; it's just that JavaScript seems to have both in extremes. Whatever you think of it, history has made it the language for scripting the client-side web. It has become a mainstream language that shows no sign of falling off.

The excellent book JavaScript: The Good Parts by Douglas Crockford, working with the jQuery library and learning a little Lisp have led me to really embrace JavaScript.

It's no secret that I like programming with interactive consoles, and I decided I wanted to find out if there was an interactive console for JavaScript. A language that only lives inside a web browser environment didn't seem right to me.

Rhino is a JavaScript implementation on the JVM. It has a compiler, a debugger and interactive console.

To get started you obviously need a version of the JVM. That's not too difficult. On Windows I just downloaded a Sun Java 6 installer. On my Ubuntu install I installed openjdk but found Rhino didn't work, so I installed sun-java6, which worked.

sudo apt-get install sun-java6-bin sun-java6-jre sun-java6-jdk

You can find what version you've installed by running

$ java -version
java version "1.6.0_15"
Java(TM) SE Runtime Environment (build 1.6.0_15-b03)
Java HotSpot(TM) Client VM (build 14.1-b02, mixed mode, sharing)

Excellent. The Rhino binaries include js.jar, which is needed for the console. Now we should be able to run the jar:

$ java -jar js.jar


When all goes well this will take you into the Rhino shell

Rhino 1.7 release 2 2009 03 22
js>

We can start playing with the language.

js> get_counter = function() { var counter = 0; return function() { print(counter); counter++; } };
..
js> counter1 = get_counter();
.. 
js> counter1();
0
js> counter1();
1
js> counter2 = get_counter();
..
js> counter2()
0
js> counter1()
2


Which is cool. Then there is some of the weirdness:

js> '5' + 3
53
js> '5' - 2
3

And some interesting features (a leading zero makes parseInt guess octal unless you pass a radix):

js> parseInt('06')
6
js> parseInt('08')
NaN
js> parseInt('10')
10
js> parseInt('010')
8


I'm looking forward to learning more about writing code in JavaScript.

