Map and Reduce in ColdFusion, a practical example

 Tuesday May 24, 2011  ·  7284 views  ·  6 comments
A few weeks ago I started a project called Collections.cfc that brings iteration functions such as map(), reduce() and filter() to ColdFusion collections. JavaScript and a few other languages have had these features for some time, so I felt it would be fun to try to bring that behavior over into the CF world.

Traditionally, functions such as map() and reduce() have been limited to arrays, but since ColdFusion rocks hard \m/, I wanted to go one step further and extend the functionality to structures as well.

This example makes use of both collection types (arrays and structs) and several Collections.cfc methods to find the most popular keywords used in the descriptions of the blogs that ColdfusionBloggers.org aggregates.

Obtain our dataset
The blog feed list from ColdfusionBloggers.org is presented in OPML, which means we can easily extract the descriptions from each blog item using XML functions.

Extract the descriptions
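To make the feed-parsing step concrete before moving on, here is a rough sketch in Python rather than CFML. It is not the code from the download: the two-entry OPML sample and the parse_outlines name are made up for illustration, and it assumes each description is carried as an attribute on an outline element, which may not match the real feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-entry sample; the real feed list is much longer and
# may structure its descriptions differently.
SAMPLE_OPML = """<opml version="1.1">
  <body>
    <outline text="Blog A" description="ColdFusion tips and tricks" />
    <outline text="Blog B" description="Thoughts on ColdFusion and Flex" />
  </body>
</opml>"""


def parse_outlines(opml_text):
    """Parse the OPML text and return every <outline> element in it."""
    root = ET.fromstring(opml_text)
    return list(root.iter("outline"))


outlines = parse_outlines(SAMPLE_OPML)
```

Each entry in outlines is still a full XML element at this point, which is what the next step trims down.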
Parsing the feed has given us an array of XML objects to work with, but ideally we only want the value of the 'description' element from each item. Using map() we can transform the collection into a lighter-weight version holding just the description from each blog entry: an array of strings.

Break down the descriptions
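Before going further, here is roughly what that map() step looks like, again sketched in Python as a stand-in and not as the Collections.cfc API. The sample items and the get_description helper are hypothetical, and the description is assumed to live in an attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the parsed feed items.
items = [
    ET.Element("outline", {"text": "Blog A", "description": "ColdFusion tips and tricks"}),
    ET.Element("outline", {"text": "Blog B", "description": "Thoughts on ColdFusion and Flex"}),
]


def get_description(item):
    """Return only the description text for one feed item."""
    return item.get("description", "")


# map() builds a new collection by running the function on every item.
descriptions = list(map(get_description, items))
```

The original items are left untouched; map() hands back a new, smaller collection of plain strings.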
This next step could have been combined with the earlier step where we mapped the collection, but it segues nicely into an example of using forEach(). forEach() doesn't create or return another collection; it simply lets you take action on (or modify) each item in the existing one. We will use forEach() to further break down the descriptions into individual words.

Flatten the Hierarchy
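The forEach() idea can be sketched the same way. This Python version is only an approximation: the for_each helper and its (items, index) callback signature are invented for the example and may not match how Collections.cfc passes items to its callbacks.

```python
def for_each(items, action):
    """Run action on every position of items; nothing is returned."""
    for index in range(len(items)):
        action(items, index)


def split_into_words(items, index):
    """Replace the description at this position with a list of its words."""
    items[index] = items[index].split()


descriptions = ["ColdFusion tips and tricks", "Thoughts on ColdFusion and Flex"]
for_each(descriptions, split_into_words)  # modifies descriptions in place
```

The key difference from map() is that no new collection comes back; the existing one is changed where it sits.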
After breaking down the descriptions into individual words, we are left with an array of arrays. While that is OK, it will make things easier down the road to flatten the hierarchy into a single dimension. This will result in a one-dimensional array containing each of the words used in all the blog descriptions.

Count each Word
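Flattening is a small step, but it helps to see it. A minimal Python sketch follows, using the standard library and not Collections.cfc; the nested sample data is made up.

```python
from itertools import chain

# Hypothetical array of arrays, one inner list per blog description.
nested_words = [
    ["coldfusion", "tips", "and", "tricks"],
    ["thoughts", "on", "coldfusion", "and", "flex"],
]

# chain.from_iterable walks each inner list in order and yields its items,
# producing a single one-dimensional sequence.
words = list(chain.from_iterable(nested_words))
```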
Now that we have an array with all of the keywords from the descriptions, we can reduce the collection down to the unique words and the number of times each one appears.

Ignore common words
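Here is the counting reduce sketched in Python, as an illustration of the technique and not of the library's own reduce() signature. The count_word reducer and the sample words are hypothetical.

```python
from functools import reduce


def count_word(counts, word):
    """Reducer: bump the tally for word and pass the accumulator along."""
    counts[word] = counts.get(word, 0) + 1
    return counts


words = ["coldfusion", "tips", "and", "tricks",
         "thoughts", "on", "coldfusion", "and", "flex"]

# Start from an empty dict (the struct in CF terms) and fold every word in.
word_counts = reduce(count_word, words, {})
```

Note that the array goes in and a struct-like mapping of word to count comes out, which is why being able to reduce across collection types is handy.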
To get meaningful results from our collection we can use the filter method to get rid of common words, articles and symbols. Filter keeps only those items in the collection that 'pass' the comparator.

Most Popular Word
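A filter over a struct of counts might look like the following in Python. This is a sketch under assumptions: the stop-word list is a tiny made-up sample, and is_meaningful is a hypothetical comparator, not the one from the original code.

```python
# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "on", "of", "to"}


def is_meaningful(word):
    """Pass only alphabetic words that are not in the stop-word list."""
    return word.isalpha() and word not in STOP_WORDS


word_counts = {"coldfusion": 2, "and": 2, "tips": 1, "on": 1, "&": 1}

# Keep only the entries whose key passes the comparator.
filtered = {word: count for word, count in word_counts.items()
            if is_meaningful(word)}
```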
Now that we have the word counts, we can search the collection using max() to find the word with the highest frequency. We could sort the collection at this point and just pop off the first record, but Collections.cfc has a max() method which allows you to define what is deemed the 'largest'.

Top 40 Popular Words
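The "you define what counts as largest" idea maps onto Python's built-in max() with a key function. The sketch below uses invented counts and a hypothetical by_count helper; it shows the concept only and says nothing about how Collections.cfc implements max().

```python
# Hypothetical counts left over after filtering.
word_counts = {"coldfusion": 5, "flex": 3, "tips": 1}


def by_count(entry):
    """Rank a (word, count) pair by its count."""
    return entry[1]


# max() scans the pairs once and keeps the one with the highest count,
# so no full sort is needed just to find a single winner.
top_word, top_count = max(word_counts.items(), key=by_count)
```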
For an extra bit of fun, the following code will reduce the collection into an array so that it can be sorted, and then from that, display a list of the top 40 most popular keywords people have used to describe their blog.

After running all of the code above, you will end up with the following:
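For reference, the sort-and-trim step behind that top 40 list can be sketched in Python like this. The top_words helper, its limit parameter, and the tie-breaking by alphabetical order are all assumptions made for the example, and the sample counts are invented.

```python
def top_words(word_counts, limit=40):
    """Return up to limit (word, count) pairs, highest count first."""
    pairs = list(word_counts.items())
    # Sort by count descending; break ties alphabetically so the order is stable.
    pairs.sort(key=lambda pair: (-pair[1], pair[0]))
    return pairs[:limit]


sample_counts = {"flex": 3, "coldfusion": 5, "tips": 1, "ajax": 3}
top_three = top_words(sample_counts, limit=3)
```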

Download the full code snippet here


Comments




  • The image you posted isn't showing up. I'd love to see the results.



  • I wonder if the overhead of Collections.cfc is worth it... especially since CF10 is coming out with closure support. Hmm...



  • @Henry - I'm not so sure that the inclusion of closures in the next release would enable the type of functionality that I included in the library. In my understanding, having closures would allow you to pass in anonymous functions instead of predefined ones, or allow you to execute against an object. I think you would still have to perform the work that Collections.cfc is doing to get the same functionality.



  • Nice post, thanks. BTW, I was able to look at the image URL and guess the proper location of the missing image (there was an extra "/new" in there) that shows the results of your code sample:
    http://jalpino.com/images/blog/top40words.png



  • @Jamie, thanks for the drop on the missing image. I migrated over to a new host (VPS) and a lot of my posts got borked.



  • I can't download your file...
    http://www.jalpino.com/new/media/collections.zip

 
