The amazing adventures of Doug Hughes

Archive for July, 2010

Ant4CF's Documentation and Ant 1.8

I've been notified twice now that there are problems running Ant4CF on Ant 1.8 or later. I wanted to let those users of Ant4CF know that I'm aware of these problems and will eventually update the project to work with Ant 1.8. My guess is that the project is being compiled against an older Ant API that isn't compatible with Ant 1.8. That said, a quick Google search on this didn't turn up any details. I'll take care of this before too long and let everyone know about it!

Also, a few people have had issues getting Ant4CF up and running. The usual culprit here is the config.properties file that users need to edit to point to the local installation of Ant. On Windows in particular this can be confusing, because the default path to Ant is /usr/bin/ant. On *nix OSes this is a path to an executable, but to a Windows user it looks like it might be a path to a directory. I've updated the documentation to explain that on Windows this path is likely to be something more like c:/program files/ant/bin/ant.bat. Also, because there are spaces in the path, the path needs to be surrounded by quotes in the configuration file.

And that's it for Ant4CF news. But, let me ask you, do you use Ant4CF? What do you think of it? Any suggestions on what I could do with it next?

I'll Be At CFUnited After All!

It looks like the stars have aligned for me after all, and I will be attending this year's (and the last ever) CFUnited. I managed to schedule a couple of client meetings at the same time and that really helps justify the expense.

If you're going to be there too, please look me up! Why? Well, there's no exciting reason, honestly. Unlike past years, I'm not giving anything away. Honestly, I just want to meet as many people as I can, hear what they're working on, and learn as much as I possibly can. And, if I'm lucky, maybe I'll find a business opportunity or two.

Also, because I waited until the last moment, it seems that the Lansdowne Resort is booked up for Wednesday night. If anyone has some spare floor space or a couch I could crash on for one night, I'd be greatly indebted to you! Thanks!

An Introduction To NodeJS, A JavaScript-Based Server

Recently, I've been dipping my toes into languages and technologies I might not previously have tried. For the last ten years or so ColdFusion has been my bread and butter. Yes, I've tinkered with other languages like Groovy, but now I'm trying to make a more concerted effort to learn more about things I otherwise might not.

Recently I've been teaching myself a little Ruby and Rails. A local buddy of mine, Bucky Schwarz, brought the Raleigh-area Ruby Brigade to my attention. Last night I attended one of their meetings anticipating learning more about Ruby. However, the topic was not Ruby, but NodeJS. The talk was given by Aaron Heckmann.

I recently heard about some developments around JavaScript-based servers. Some people are making use of them because they can apparently handle insane loads. I've heard of some applications handling thousands of connections a second. Last week I tried to Google around and see what I could find, with no luck. So, imagine my surprise when the Ruby meetup topic was about NodeJS, a JavaScript server.

NodeJS is the brainchild of Ryan Dahl and makes use of Google's open source V8 JavaScript engine. Because of this, NodeJS benefits from the recent competition between Firefox, Google, IE, and Opera on JavaScript engine performance. In other words, NodeJS is very fast.

Installing NodeJS was very simple for me, since I'm on OS X. You have to download and build the source code, which was as simple as running ./configure, make, and make install. I can't say it will be quite that simple for those in the Windows world. Once NodeJS was installed I was able to run a couple of simple sample applications without any problems.

Much to my surprise, I was able to make small changes to these sample applications because I know JavaScript! I was able to take the HTTP hello world application and change it slightly. The same was true for the network socket sample. Cool!

My initial impressions of both of these samples were twofold:

  1. Wow, that starts up fast! For the simple hello world-style applications there was literally no wait to start NodeJS.
  2. The basic syntax was simple and easy to read. All you need to do to start listening for HTTP requests is this:
var http = require('http');
http.createServer(function (req, res) {
  res.end('Hello World');
}).listen(8000); // any free port will do

Now, there are some challenges with Node. In particular, it's intended to be pretty low-level. Those of us more comfortable working with higher-level languages might find Node, perhaps, a bit cumbersome.

Thankfully there are a lot of handy modules available for Node. One of these is Express. Express is a small server-side JavaScript web development framework for NodeJS.

To install Express I had to first install Kiwi, which is yet another package management system. To install Kiwi, I simply cloned the Git repository locally using:

git clone http://github.com/visionmedia/kiwi.git

Next, I ran make install in the newly downloaded kiwi directory. For me, this just worked.

Once I had Kiwi I was able to install express by simply running:

kiwi -v install express

Now that I had everything I needed, I was able to copy and paste a quick Express hello world application:

var sys = require("sys"),
kiwi = require("kiwi"),
express = kiwi.require('express')

get('/', function(){
     this.redirect('/hello/world')
})

get('/hello/world', function(){
     return 'Hello World'
})

get('/goodbye/world', function(){
     return 'Goodbye World'
})

run()

In the example above, once you start Node running this application, you can reach it at http://localhost:3000. The default request will redirect you to /hello/world. You can also go to /goodbye/world.
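To make it clearer what those three routes are doing, here is a rough sketch of the same behavior using nothing but Node's built-in http module. The route() helper is a name I made up for illustration. It is not part of Express, and this is not meant to show how Express works internally.

```javascript
const http = require('http');

// Hypothetical helper (not an Express API): maps a request path
// to a plain description of the response we want to send.
function route(path) {
  if (path === '/') {
    // Same redirect as the first Express route above.
    return { status: 302, headers: { Location: '/hello/world' }, body: '' };
  }
  if (path === '/hello/world') {
    return { status: 200, headers: { 'Content-Type': 'text/plain' }, body: 'Hello World' };
  }
  if (path === '/goodbye/world') {
    return { status: 200, headers: { 'Content-Type': 'text/plain' }, body: 'Goodbye World' };
  }
  return { status: 404, headers: { 'Content-Type': 'text/plain' }, body: 'Not Found' };
}

// Wire the helper into a server. Nothing listens until listen() is called.
const server = http.createServer(function (req, res) {
  const path = req.url.split('?')[0]; // ignore any query string
  const result = route(path);
  res.writeHead(result.status, result.headers);
  res.end(result.body);
});

// server.listen(3000); // uncomment to serve on the same port as the Express example
```

Keeping the routing decision in a plain function makes it easy to poke at without starting a server, which is roughly the convenience a framework like Express gives you in a much nicer package.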

The Express module has a range of other features for routing requests, rendering views, and more. Frankly, I haven't taken it much beyond what's in this article yet.

Despite the relative youth of the NodeJS project, it seems to me like there's a lot of energy and excitement behind it. You only need to look at the Module list to realize there are a lot of people working very hard on this project. Just to highlight a few, there are modules to connect to various databases, including MySQL and MongoDB, and many others. There are also modules for logging, templating, testing, and much, much more.

Overall, NodeJS really seems worth trying out and experimenting with.

Dan Wilson Will Kick Ass at CFUnited

Last Thursday the local Triangle Area ColdFusion User Group (TACFUG) held a meeting where our very own Dan Wilson previewed two of the four presentations he'll be giving at CFUnited next week.

The first presentation he gave was Get the Lead Out – Practical Optimization. As with everything Dan does, this was a very good presentation. Dan started out the presentation with a set of slides asking which was faster between two sets of similar code. For example, he compared using cfif to cfswitch. I was really surprised by this, but it turned out this was a red herring and wasn't the real meat of the presentation. Dan gets into a lot of detail about holistically testing applications. In fact, he even taught me a few things I didn't know, and I like to think I know a thing or two about this stuff.

The second presentation has what may well be the best title I've ever seen at a tech conference, Cache Me if You Can. Of course, that title doesn't make any sense to our Australian friend, Mark Mandel, but we'll just have to deal with that. In this presentation Dan talked about a whole range of assumptions many developers make about caching. He also makes some very interesting points about being sure that you know what you're caching and why. As usual, this was an excellent presentation as well.

Dan has a really easy and relaxed style of presentation that I really admire. If you've not heard him talk and you'll be at CFUnited, then I highly recommend you catch his talks!

CFUnited 2010: This Is Your Last Chance!

I wanted to take a moment and let those developers in the ColdFusion community who haven't already heard know that this is the last year for CFUnited. Sadly, I've heard that it's lost money the last few years and apparently it's being shelved. Who knows if this will be a permanent thing or not, but it's safe to assume it is.

What does this mean? Well, it means that if you haven't gone before, this is the last chance for you to come to the largest and most focused ColdFusion conference in the world (at least that I know of). Not only that, but there are also always presentations about related technologies like Flex, jQuery, and more.

Here are some of the reasons I think you should go:

  • I know most of the speakers and I can tell you that they're all freakin' smart. Some of them are downright prolific in their talent. All they want to do is share that information with you, the attendee. Furthermore, all of these people are very approachable and love to talk about ideas, problems, concepts, and more. Buy them a beer and you'll have a smart friend for life! Could you ask for a better resource?
  • The networking opportunities are very good as well. For the last few years Alagad has sponsored CFUnited and I think we've done well meeting new potential clients and making our name and our development services known to the community. Clearly this is valuable for businesses. However, for the individual it can be even more valuable. We all know the market is very shaky right now. I assure you that the more people you know in the business the better your prospects are for finding new work, should you need to.
  • The reverse is true too. That is to say that if you're looking for ColdFusion talent, this is where to find it. The conference attendees are all passionate enough about ColdFusion to either convince their employers to send them or to pony up their own hard-earned cash.
  • There's no better place to find out about the latest and greatest news from the ColdFusion community. CFUnited is historically where Adobe makes major announcements or demos new and exciting technology. Go and get ahead of the curve.

You should also know that today, July 15th, is the last day for early bird pricing. There are also special offers for those who are unemployed. Find more information about this here.

So, why am I writing this entry? Some of you in the community already know that I won't be at this CFUnited. Sadly, that's true. I simply can't make it this year. I really wish I could, but the stars are not aligned in my favor. I suppose I'm hoping someone else will go in my place and get all the benefits I'm sadly going to miss.

A Technical Pecha Kucha Con?

This year I was introduced to the Pecha Kucha presentation format. For those not familiar with it, Pecha Kucha (pronounced peh-cha koo-cha) originated in Japan and translates as the sound of conversation or chit-chat. A Pecha Kucha presentation lasts only 6 minutes and 40 seconds and is made up of 20 slides that automatically advance every 20 seconds.

I became aware of the Pecha Kucha format when Bob Silverberg, everyone's favorite Canadian ColdFusion developer, volunteered to put on a Pecha Kucha BOF session at the cf.Objective() conference. Typically, Pecha Kucha sessions are about something the presenter is passionate about or deeply involved in. Ben Nadel talked about people-centric software design. Steve Withington talked about beer. There were a few other topics, though unsurprisingly, most were technical in nature. At the time, I was under quite a bit of stress and so I wrote my presentation on stress and how I manage it.

Personally, I thought the format was excellent. The crowd was really into the presentations and energetic, laughing and cheering at all the right places. Since then I've given the presentation two other times, with a similar response.

One of the things I like about the format is that it forces the presenter to be concise and get their point across as quickly, clearly, and efficiently as possible. Additionally, if I’m uninterested in a presentation I only have to wait about five minutes for it to be over.

Since giving these presentations I've pondered what would happen if you crossed a technical conference with Pecha Kucha. Let's face it, there's way too much new information in the technology world to keep up with effectively. You could read blogs all day long and still not be up to date on the majority of what's new.

Consider that over the last few years we've had a renaissance of dynamic languages, new frameworks have emerged, software development approaches have changed, and more. In a nutshell, things are changing, and fast! As an example, I recently heard about the reemergence of server-side JavaScript. Who would have thought?!

If we don't know what's new then we're stagnating. For this reason I'm seriously thinking about putting on a language-agnostic tech conference. I've loosely titled this Pecha Kucha Con. The idea is that the conference would be either one or two days with only one track. Each presentation would be 6 minutes and 40 seconds long on a topic that no other speaker would be talking about. The purpose of these talks would be to give the audience a small slice of information about this topic and just enough to get started researching it, if they're interested.

I'm thinking that in one day you would have approximately 27 presentations grouped in fours. So, for example, from 10:00 am to 10:30 you'd have four presentations. There would be fifteen-minute breaks every 30 minutes for refreshments and networking. Add in a long lunch and morning and evening networking events and you've got a lot of opportunity to get introduced to a lot of things you wouldn't otherwise find out about. Furthermore, the technology-agnostic aspect would hopefully create an opportunity for cross-pollination where maybe there isn't typically (like between, say, .NET and Erlang programmers).
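As a quick sanity check on those numbers, here is the back-of-the-envelope arithmetic as a small script. It assumes breaks fall only between blocks and leaves out lunch and the networking events, so treat the total as a rough floor rather than a real schedule.

```javascript
// Pecha Kucha format: 20 slides, 20 seconds each.
const SLIDES = 20;
const SECONDS_PER_SLIDE = 20;
const talkSeconds = SLIDES * SECONDS_PER_SLIDE; // 400 seconds per talk

// Format a number of seconds as m:ss.
function formatMinutes(seconds) {
  const m = Math.floor(seconds / 60);
  const s = seconds % 60;
  return m + ':' + String(s).padStart(2, '0');
}

// Conference shape from the paragraph above.
const TALKS = 27;
const TALKS_PER_BLOCK = 4;
const BLOCK_MINUTES = 30;
const BREAK_MINUTES = 15;

// 27 talks in groups of four means six full blocks plus one block of three.
const blocks = Math.ceil(TALKS / TALKS_PER_BLOCK);

// Time left over in a 30-minute block for speaker changeovers.
const slackSeconds = BLOCK_MINUTES * 60 - TALKS_PER_BLOCK * talkSeconds;

// Assumption: one break between each pair of blocks, nothing else counted.
const totalMinutes = blocks * BLOCK_MINUTES + (blocks - 1) * BREAK_MINUTES;

console.log('One talk:', formatMinutes(talkSeconds));
console.log('Blocks needed:', blocks);
console.log('Slack per full block:', formatMinutes(slackSeconds));
console.log('Talks plus breaks, in minutes:', totalMinutes);
```

So four talks fit in a half-hour block with a few minutes to spare for swapping speakers, and the talks and breaks alone come to about five hours.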

I think there's a lot more that could go with this as well. For example, make it a multi-day conference. Or have multiple tracks, perhaps for programming, management, design, etc. Perhaps this conference could be held both online and offline. For example, maybe there would be a venue in RTP, NC, Washington, DC, and Los Angeles, CA. Each one of these events could broadcast through Adobe Connect to each other and to the general public who might not be close to one of these areas. This would allow for very wide audience involvement and a unique conference experience.

So, on the surface, do you think this sounds like an interesting idea? Do you have any additional ideas that might go along with this? I've purchased the domain name pechakuchacon.com. I suppose I'll see what sort of interest there is and decide if it's worth the effort or not.

A Quick Overview of Graphing Databases

I recently started work on a new project which, to avoid getting into too many details, is a social media application with similarities to Twitter, Facebook, and FourSquare, though it is not a clone of any of these!

This is currently a hobby project of mine that I think may have some future potential. As a result, I'm allowing myself the freedom to experiment with technologies outside of what I'd normally work with. On this specific project I plan to use Ruby on Rails 3 (currently in beta) and deploy the final application to Heroku. (Side note: we really need something like Heroku in the ColdFusion world.)

Because this is a social application and many social applications make use of so-called NoSQL databases, I started researching these. My research began by talking to John Paul Ashenfelter at this year's cf.Objective(). John gave a talk entitled Say NO to SQL. Sadly, I missed this talk, but I sat down with John after the fact and talked about the various types of NoSQL databases and where they fit in.

It turns out that there are several types of NoSQL databases. These include column stores, key-value stores, document stores, and graph databases. For this article I'm going to ignore all except for graph databases.

So, what exactly is a graph database?

Let's start exploring this by looking at standard relational database systems. There are a number of well-established RDBMS such as MySQL, PostgreSQL, MSSQL, Oracle, etc. Chances are you're familiar with at least one of these. These types of systems store data in tables that are made up of columns. Each record in the database provides values for these columns. For referential integrity, some columns may reference a column in another table via a foreign key.

These foreign keys are the only way to relate data in a relational database. And, for the most part, through normalization, these systems can model essentially any data.

The problem isn't really in the modeling, however. The problem is in how you get data out of the database. Consider a situation where you're modeling a social network. In this network I may have dozens of friends and you may have dozens of friends, and each of our friends may have their own dozens of friends. Invariably, some of these will be the same people.

Now, I ask you, how would you find out how I know you using SQL? How would you be able to figure out how I know Kevin Bacon in SQL? Furthermore, how would you do this in any efficient manner? The answer is: not easily.

The fact of the matter is that although you can model this data in relational databases, these systems are simply not optimized to query this type of information back out.
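To give a feel for why this gets awkward, here is a small sketch that mimics a friendships join table with an array of rows. The table layout, the ids, and the friendsOf() helper are all made up for illustration. The point is only that every extra degree of separation means going back over the whole table again, which is what another self-join amounts to.

```javascript
// A "friendships" join table as rows. Each friendship is stored once
// per direction to keep the lookups simple. Ids are hypothetical.
const friendships = [
  { personId: 1, friendId: 2 },
  { personId: 2, friendId: 1 },
  { personId: 2, friendId: 3 },
  { personId: 3, friendId: 2 },
  { personId: 3, friendId: 4 },
  { personId: 4, friendId: 3 },
];

// Roughly: SELECT DISTINCT friend_id FROM friendships WHERE person_id IN (...)
function friendsOf(ids) {
  const wanted = new Set(ids);
  const found = new Set();
  for (const row of friendships) { // scans every row, like an unindexed join
    if (wanted.has(row.personId)) {
      found.add(row.friendId);
    }
  }
  return [...found].sort((a, b) => a - b);
}

// Each hop is one more pass over the table (one more self-join in SQL).
const oneHop = friendsOf([1]);        // friends of person 1
const twoHops = friendsOf(oneHop);    // friends of those friends
const threeHops = friendsOf(twoHops); // and so on...

console.log(oneHop, twoHops, threeHops);
```

Notice that you have to know in advance how many hops to run, and that the results start looping back on people you have already seen. Finding the actual chain of people between two users takes even more bookkeeping.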

There are, however, alternatives. You guessed it, graph databases.

A graph database is a system that stores data in nodes that are connected to other nodes via edges. In most graph databases nodes and edges can have associated properties. Most graph databases allow for traversals between related nodes.
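In code, that description boils down to something like the following sketch. This is plain JavaScript, not the API of any particular graph database, and the helper names and the since property are invented for the example.

```javascript
// Nodes keyed by id, each carrying its own bag of properties.
const nodes = new Map();
// Edges connect two node ids and can carry properties too.
const edges = [];

function addNode(id, properties) {
  nodes.set(id, { id: id, properties: properties });
}

function addEdge(from, to, type, properties) {
  edges.push({ from: from, to: to, type: type, properties: properties || {} });
}

// One step of traversal: every node directly connected to the given one.
function neighbors(id) {
  const result = [];
  for (const edge of edges) {
    if (edge.from === id) result.push(edge.to);
    if (edge.to === id) result.push(edge.from);
  }
  return result;
}

addNode('doug', { name: 'Doug Hughes' });
addNode('jim', { name: 'Jim Bob' });
addEdge('doug', 'jim', 'KNOWS', { since: 2008 }); // 'since' is a made-up property

console.log(neighbors('doug')); // the ids of everyone connected to 'doug'
```

A real graph database stores this far more cleverly than a flat array, but the shape of the data is the same: things, connections between things, and properties on both.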

This image shows how you might model the information used in a social network.

Example of Graph Relationships

In the example above I have created five nodes to represent people in a social network. I've also created relationships between them. The relationships would be the edges referenced above. Note that each node and relationship has various properties. For example, you can see that I (Doug Hughes) am 32. You can also see that I know Joe Blow and Jim Bob. Of course, the graphing database can also store different types of objects, such as products.

One of the defining characteristics of a graph database is supposed to be the ability to quickly traverse nodes. So, using an API provided by the graph database system I can quickly find out how I know John Doe. The answer is through our mutual friendships with Jim Bob (or through Jim and Belva). This is also useful for situations where you want to find common themes. For example, Amazon has a feature that shows what other customers who purchased a specific product also purchased.
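Here is a minimal sketch of that kind of traversal: a breadth-first search over the five people in the example. The list of who knows whom is my reconstruction from the description above, so the exact edges are an assumption, and a real graph database would expose its own traversal API rather than this hand-rolled function.

```javascript
// Who knows whom, reconstructed from the example (assumed edges).
const knows = [
  ['Doug Hughes', 'Joe Blow'],
  ['Doug Hughes', 'Jim Bob'],
  ['Jim Bob', 'John Doe'],
  ['Jim Bob', 'Belva'],
  ['Belva', 'John Doe'],
];

// Build an adjacency map, treating every relationship as two-way.
const adjacency = new Map();
for (const [a, b] of knows) {
  if (!adjacency.has(a)) adjacency.set(a, []);
  if (!adjacency.has(b)) adjacency.set(b, []);
  adjacency.get(a).push(b);
  adjacency.get(b).push(a);
}

// Breadth-first search: returns the shortest chain of people from
// start to goal, or null if they are not connected at all.
function shortestPath(start, goal) {
  const cameFrom = new Map([[start, null]]);
  const queue = [start];
  while (queue.length > 0) {
    const current = queue.shift();
    if (current === goal) {
      // Walk backwards from the goal to rebuild the chain.
      const path = [];
      for (let p = goal; p !== null; p = cameFrom.get(p)) {
        path.unshift(p);
      }
      return path;
    }
    for (const next of adjacency.get(current) || []) {
      if (!cameFrom.has(next)) {
        cameFrom.set(next, current);
        queue.push(next);
      }
    }
  }
  return null;
}

console.log(shortestPath('Doug Hughes', 'John Doe'));
```

The search only ever looks at the neighbors of people it has already reached, which is why this sort of question is so much more natural to ask of a graph than of a pile of joined tables.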

There are variations between graph databases as well. For example, some use directional relationships and others use bidirectional relationships. The difference is that a directional relationship may not necessarily be reciprocal. For example, on Twitter, I could follow you, but maybe you don't follow me. Bidirectional relationships are more like Facebook where if I'm your friend, you're my friend. The example above would be bidirectional.
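The difference is easy to see in a few lines of code. This is only a sketch with made-up names (createGraph, relate, isRelated), not how any of the databases below actually store relationships.

```javascript
// Hypothetical helper: builds a tiny in-memory graph that is either
// directional (Twitter-style) or bidirectional (Facebook-style).
function createGraph(directed) {
  const outgoing = new Map(); // person -> Set of people they point at

  function link(a, b) {
    if (!outgoing.has(a)) outgoing.set(a, new Set());
    outgoing.get(a).add(b);
  }

  return {
    relate: function (a, b) {
      link(a, b);
      if (!directed) link(b, a); // bidirectional: record the reverse too
    },
    isRelated: function (a, b) {
      return outgoing.has(a) && outgoing.get(a).has(b);
    },
  };
}

const twitterLike = createGraph(true);
twitterLike.relate('me', 'you'); // I follow you...

const facebookLike = createGraph(false);
facebookLike.relate('me', 'you'); // we become friends

console.log(twitterLike.isRelated('you', 'me'));  // ...but you don't follow me back
console.log(facebookLike.isRelated('you', 'me')); // friendship goes both ways
```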

Because of the nature of graph databases, they are very fast for traversing nodes and finding related data. I'm not entirely sure at this point where they break down. I've read that they're not as efficient for large-scale updates where you may be updating a lot of records at one time. Beyond this, your mileage may vary.

I've done a lot of reading up on different graph databases. The ones that stuck out to me were these:

Neo4J

Neo4J appears to be the most widely used graph database and is the one I've spent the most time researching. It's available through a very restrictive AGPL license or commercially. It strikes me as very expensive to license.

Neo4J is an embedded directional graphing database written in Java. The FOSS version provides a JAR that you download and make use of in your application. Alternatively, there is also a stand-alone version that exposes a RESTful API.

There are a number of language bindings available, most of which use the REST API. There are, however, native JRuby bindings. I'd be interested in this, except for the fact that Heroku doesn't support JRuby.

It's my interpretation that Neo4J still needs a little baking to really be a good solution. For example, the REST API has no security built in. Anyone who can connect to the port that is exposed can add, update, or delete information in the database.

Neo4J handles scaling and redundancy much like a typical RDBMS. Specifically, the paid version allows you to somehow replicate data to hot-spare servers. If you need to shard your data across multiple servers you must manage it manually within your application.

From everything I've read, Neo4J really seems to be the strongest graphing database. However, it has negatives in that I'm not sure whether the paid version differs at all from the FOSS version. Documentation is pretty good, but seems to be lacking in some areas (specifically related to high availability).

Oh, it's also apparently blindingly fast. However, I can't find any information on how the use of the REST API impacts performance.

If you're interested in experimenting with Neo4J in ColdFusion, I suggest you check out Brian Panulla's blog entry entitled Using Neo4j Graph Databases With ColdFusion.

InfiniteGraph

InfiniteGraph describes itself as a distributed database for web-scale systems. Currently in public beta, it is slated for release in late July 2010 (any time now, really).

This system is written in Java and supports server-based, cloud-based, and embedded use. The premise of InfiniteGraph is that it can apparently achieve nearly linear performance scaling as additional servers are added.

I can't find where I read this, but my memory is telling me that InfiniteGraph uses bidirectional relationships.

InfiniteGraph will be a closed source, proprietary, for-fee product when it is released. They do have programs for free usage, but they seem to tie you to a specific hosting provider. It even looks like you need to pay for developer licenses.

This bears watching, but I'm concerned about the licensing details and pricing. Furthermore, it might not be terribly easy to connect to from non-Java languages. A C# API is due in the next major release.

FlockDB

FlockDB is Twitter's own graphing database. However, this is not quite what it seems to be. As the Twitter developer blog explains in the provided link, FlockDB was engineered as a specific solution to scalability problems Twitter was experiencing.

Behind the scenes FlockDB actually just uses MySQL. Additionally, despite the fact that FlockDB is a graphing database, it's not actually optimized for graph traversal. Instead, it's very good at adjacency lists (who's following whom). Flock also allows for horizontal scaling, though this appears to be somewhat manual.
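If the term is unfamiliar, an adjacency list is just a list of the things directly connected to a given thing. Here is a rough sketch of the idea for followers, kept in both directions so either question can be answered with a single lookup. The user names and function names are invented, and this is not FlockDB's API or storage format.

```javascript
// Two indexes over the same "follows" relationships.
const following = new Map(); // user -> Set of users they follow
const followers = new Map(); // user -> Set of users who follow them

function addTo(index, key, value) {
  if (!index.has(key)) index.set(key, new Set());
  index.get(key).add(value);
}

// Record one follow in both directions.
function follow(source, destination) {
  addTo(following, source, destination);
  addTo(followers, destination, source);
}

// Read one adjacency list back out as a sorted array.
function listOf(index, user) {
  return [...(index.get(user) || [])].sort();
}

follow('alice', 'carol');
follow('bob', 'carol');
follow('carol', 'alice');

console.log(listOf(followers, 'carol')); // who follows carol
console.log(listOf(following, 'carol')); // whom carol follows
```

Answering "who follows carol?" is a single lookup here. Answering "how is alice connected to someone five hops away?" is a different kind of question, and it's the one this design isn't built for.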

In the end, I honestly haven't done much reading on this tool since it didn't really match what I was looking for in a graphing database.

Other Options

There are a number of other graphing database engines available, but most of them are fairly specialized or are pretty esoteric. I'm not sure I would want to deploy a large-scale system on any of the alternatives.

What am I Doing With My Social Application?

After doing quite a bit of research in this area and briefly experimenting with Neo4J, I've actually elected not to go the route of using a graphing database for my project. The reason I made this decision is that all of the graphing database implementations I looked at were either immature, lacking in documentation, or difficult to talk to from my chosen language.

Furthermore, you may remember that my hosting platform of choice is Heroku. Heroku actually runs in Amazon's EC2 service, which makes it easy for me to run my own EC2 servers to host my database server instances. However, in the end, I've decided to simply use PostgreSQL, which Heroku already supports.

I have to keep in mind that what I'm building right now is really just a hobby application. I can't justify spending a ton of money experimenting with a database offering that isn't really required at this point. If, in the future, I do reach a point where I need really, really fast access to related data I may port over to using Neo4J. Only time will tell!
