Some littleS3 documentation!

November 26th, 2008

So, I promised some documentation “soon” for littleS3. That was 2 months ago. Well, I have finally made good. I have just published a “Getting Started” wiki page to the project site. So far, this document provides some background on the project components, how to deploy it to an application server, and what the configuration files “configure” (along with sample configuration files in the project download section).

I would still like to add some samples of how to use the system to create buckets, add objects, etc. Usage is very similar to what the Amazon S3 Developer Guide describes for the REST API, but there is a bit of a trick since you are using your own application server. In addition to the host name, you may need to include a context path (a servlet notion) in the REST URIs.
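To make the context path point concrete, here is a minimal sketch of how such a URI could be put together. It is not taken from littleS3 itself: the class and method names are made up for illustration, the context path "/littleS3" is only a guess at how the war might be deployed, and it assumes the bucket is addressed in the path (as in path-style S3 requests).

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper: builds REST URIs for an S3-style service behind a servlet context path. */
final class ContextPathUris {
    private ContextPathUris() {}

    /** Returns "" for the root context, otherwise a path with a leading slash and no trailing slash. */
    static String normalizeContextPath(String contextPath) {
        if (contextPath == null) {
            return "";
        }
        String p = contextPath.trim();
        while (p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        if (!p.isEmpty() && !p.startsWith("/")) {
            p = "/" + p;
        }
        return p;
    }

    /**
     * @param host        host and optional port, e.g. "localhost:8080"
     * @param contextPath servlet context path (assumed value, depends on your deployment)
     * @param bucket      bucket name
     * @param key         object key, or null to address the bucket itself
     */
    static URI objectUri(String host, String contextPath, String bucket, String key) {
        String path = normalizeContextPath(contextPath) + "/" + bucket;
        if (key != null) {
            path += "/" + key;
        }
        try {
            // The multi-argument constructor percent-encodes illegal path characters such as spaces.
            return new URI("http", host, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build URI for " + host + path, e);
        }
    }
}
```

With these assumptions, `ContextPathUris.objectUri("localhost:8080", "/littleS3", "mybucket", "photos/cat.jpg")` gives `http://localhost:8080/littleS3/mybucket/photos/cat.jpg`, while an empty context path drops the extra segment.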

Google can sort

November 22nd, 2008

Google recently announced that they were able to sort 1 terabyte (TB) in 68 seconds using 1,000 computers. The previous record was 209 seconds on 910 computers. I was impressed by this because I recently read about MapReduce and have been studying some of Google’s papers about the Google File System. Google used both MapReduce and the Google File System to attain this sorting record. But, being Google, they thought that since they did 1 TB so successfully, why not try sorting 1 petabyte (PB)? (A petabyte is a thousand terabytes.) Google was able to sort 1 PB in six hours and two minutes using 4,000 computers.
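For a rough sense of scale, the figures above can be turned into throughput numbers. This is only back-of-the-envelope arithmetic, assuming decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and an even split of the work across machines; the class name is made up for this sketch.

```java
/** Back-of-the-envelope throughput for the sort runs quoted above (hypothetical helper). */
final class SortThroughput {
    static final double TERABYTE = 1e12; // assumption: decimal terabyte
    static final double PETABYTE = 1e15; // assumption: decimal petabyte

    /** Total data sorted per second across the whole cluster, in MB/s. */
    static double aggregateMBps(double bytes, double seconds) {
        return bytes / seconds / 1e6;
    }

    /** Average share of that throughput per machine, in MB/s. */
    static double perMachineMBps(double bytes, double seconds, int machines) {
        return aggregateMBps(bytes, seconds) / machines;
    }

    static void report(String label, double bytes, double seconds, int machines) {
        System.out.printf("%s: %.0f MB/s aggregate, %.1f MB/s per machine%n",
                label, aggregateMBps(bytes, seconds), perMachineMBps(bytes, seconds, machines));
    }

    public static void main(String[] args) {
        report("1 TB, previous record", TERABYTE, 209, 910);
        report("1 TB, Google", TERABYTE, 68, 1000);
        report("1 PB, Google", PETABYTE, 6 * 3600 + 2 * 60, 4000);
    }
}
```

Under those assumptions the per-machine rate works out to roughly 5, 15, and 12 MB/s for the three runs, so the petabyte sort kept most of the per-machine pace while handling a thousand times more data.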

Why does Google care about sorting? One reason may be that their primary revenue source is advertising. And they have access to massive amounts of data submitted by their end users in the form of search queries. The more efficiently Google crunches this information, the better they can target their advertising to users, resulting in more revenue. And Google can use their data for other purposes too, like predicting flu outbreaks.

I have been very impressed by what I have been reading about MapReduce and the Google File System. These sorting results help prove how efficient their infrastructure is. I particularly like how they use commodity computers to achieve these results. I know that using multiple nodes can get tricky very quickly. But their techniques seem to be designed from the ground up to use multiple nodes. And with this mindset, they can more effectively manage and utilize their collective computing resources.

lightningtimer.net

November 13th, 2008

I found Simon Willison’s lightningtimer.net JavaScript timer to be totally awesome. The app is written in JavaScript, all on one page. It lets you specify a time, and then a big-font timer counts down. The timer takes up the whole page. It can give you a warning (the background turns pink) when the timer is almost up. When the timer reaches zero, the background turns red and 0:00 blinks. Take a look at the page source for documentation on how to use lightningtimer.net. He says that he needed something for a Lightning Talk and this is the result.

Looking at the code (like I said, the JavaScript is all in the page source, so just “view source” in your browser), it is a really cool example of JavaScript. I was amazed at how simple and elegant Simon’s code was. As someone who is definitely a poor JavaScript hack, it is nice to see some good JavaScript. And, I can use it as a tea timer too! 🙂

Greasemonkey for Pearson Access

November 12th, 2008

During the day, I work professionally on a web application called Pearson Access. Today we determined that we needed a user administration process for one of our Pearson Access customers that would require them to always select a certain user role when creating a new user. The system doesn’t auto-check the role, so the user will need to be trained to check it.

But, being a geek, that got me thinking. What if a Greasemonkey script in Firefox could automatically check the role when creating a new user account? So, I wrote such a script: autoselectrole.user.js.

What I’m reading: locks!

October 10th, 2008

I have been reading some of the papers published by the Google engineers. It started with Bigtable: A Distributed Storage System for Structured Data. I am not sure how I started. The Official Google Blog posted a link announcing their new technology roundtable series. I watched the “MapReduce” discussion, where the engineers talked about Bigtable and how it is used in MapReduce. This led me to look for more information about Bigtable as I was looking for information on distributed “communication” techniques to enhance the littles3 implementation. (The current littles3 architecture is very simple and only supports one node. It works, but doesn’t do any cool things like scale storage or be fault tolerant.) I had heard Bigtable discussed in different technical blog settings, but I had no idea that there was a paper from 2 years ago that described the Bigtable system. (I guess I don’t read the technical CS journals like I should. I may have to become more active in IEEE.)

While reading the paper (I did find it very readable. Okay, I am a computer geek. Fair warning.), I noticed that Bigtable, which is a highly scalable distributed database (not relational), used a “lock service” called Chubby. What is a “lock service”? Well, the paper The Chubby Lock Service for Loosely-Coupled Distributed Systems will tell you. I am currently reading this paper. (Again, this is from 2006! Where have I been?) Mike Burrows, the author, sprinkles humor into a computer science paper discussing Paxos, “a family of protocols for solving consensus in a network of unreliable processors”. What I found interesting is how the “lock service” is used to share information in a highly distributed system. The Bigtable implementation is a client of the “lock service” and uses it to elect a leader: the leader is the node that acquires the lock, and only one node will get the lock. The “lock service” can also store small amounts of information, like metadata or configuration information, that a client application can read from the “lock service”.
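The leader election idea can be sketched in a few lines. To be clear, this is a toy, single-process stand-in and not Chubby's actual API: the class names, method names, and the "leader" lock name are all invented for illustration, and a real lock service also has to deal with sessions, lock expiry, and replication through Paxos, none of which appear here.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Toy in-memory "lock service" (hypothetical; not Chubby's API). */
final class ToyLockService {
    private final ConcurrentMap<String, String> lockHolders = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, String> smallFiles = new ConcurrentHashMap<>();

    /** Atomically grants the lock to nodeId if nobody holds it; returns false otherwise. */
    boolean tryAcquire(String lockName, String nodeId) {
        return lockHolders.putIfAbsent(lockName, nodeId) == null;
    }

    /** Releases the lock only if nodeId is the current holder. */
    boolean release(String lockName, String nodeId) {
        return lockHolders.remove(lockName, nodeId);
    }

    Optional<String> holder(String lockName) {
        return Optional.ofNullable(lockHolders.get(lockName));
    }

    /** Stores a small piece of metadata or configuration that clients can read back. */
    void write(String name, String contents) {
        smallFiles.put(name, contents);
    }

    Optional<String> read(String name) {
        return Optional.ofNullable(smallFiles.get(name));
    }
}

/** Every node tries to take the same lock; whichever succeeds is the leader. */
final class LeaderElectionDemo {
    static String electLeader(ToyLockService locks, List<String> nodeIds) {
        String leader = null;
        for (String nodeId : nodeIds) {
            if (locks.tryAcquire("leader", nodeId)) {
                leader = nodeId;
                // The winner advertises itself so the other nodes can find it.
                locks.write("leader-id", nodeId);
            }
        }
        return leader;
    }
}
```

The whole scheme rests on one atomic "set if absent" operation: because only one `tryAcquire` call can succeed, the nodes never have to negotiate with each other directly, and the losers simply read "leader-id" to learn who won.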

Next up is the paper Paxos Made Live – An Engineering Perspective. This paper provides some details on how the Google team implemented Chubby, some of the history of the previous implementation, and some of the issues that they discovered while implementing the Paxos algorithm.

Together, these papers provide some details of how Google has implemented highly distributed systems. So far, the information about Paxos has been very enlightening. And I am impressed with the way in which a “lock service” is used to coordinate communication and direct cooperation in an automated distributed network. It seems that they have created simple building blocks that together work in sometimes unique ways to make a complex system.

littles3 version 2.1.0 released

September 30th, 2008

Version 2.1.0 of “littles3” has been released. The only component change in 2.1.0 is the littleS3-2.1.0.war. This version enhances the web application configuration. The “host” value can now include a token “$resolvedLocalHost$”. Example:

host=$resolvedLocalHost$:8080

The token “$resolvedLocalHost$” will be replaced with the value of InetAddress.getLocalHost().getCanonicalHostName(). This may be handy if your application server isn’t bound to “localhost”.
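Here is a minimal sketch of what that substitution amounts to. It is not the littleS3 source: the class and method names are hypothetical, and only the token text and the InetAddress call come from the description above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper illustrating the "$resolvedLocalHost$" substitution. */
final class HostTokenResolver {
    static final String TOKEN = "$resolvedLocalHost$";

    private HostTokenResolver() {}

    /** Replaces the token with the given host name; the name is passed in so this is easy to test. */
    static String resolve(String configuredHost, String canonicalHostName) {
        // String.replace is a literal replacement, so the '$' characters need no escaping.
        return configuredHost.replace(TOKEN, canonicalHostName);
    }

    /** Replaces the token with this machine's canonical host name, looked up only when needed. */
    static String resolve(String configuredHost) {
        if (!configuredHost.contains(TOKEN)) {
            return configuredHost;
        }
        try {
            return resolve(configuredHost, InetAddress.getLocalHost().getCanonicalHostName());
        } catch (UnknownHostException e) {
            throw new IllegalStateException("Cannot resolve the local host name", e);
        }
    }
}
```

For example, `HostTokenResolver.resolve("$resolvedLocalHost$:8080", "server1.example.com")` yields `server1.example.com:8080`, while a value without the token, such as `localhost:8080`, passes through unchanged.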

littles3 version 2.0.0 released

September 25th, 2008

Version 2.0.0 of “littles3” has been released. This release restructures the project into modules: API, file system data module, and webapp. The file system module also includes support for metadata. Unfortunately, there isn’t any more documentation than before. So to get the system working, you would have to wade through the source code. But I will hopefully get some documentation created soon. 🙂

Sydney started kindergarten today

August 26th, 2008

Today was Sydney’s first day of kindergarten. We saw her off this morning on the school bus. The bus was a bit late. I bet that we were towards the end of the route. And I bet that it was late because every other parent did what we did: take a picture of their child getting on the bus.

She “ate” her lunch today…well, just the peanut butter sandwich. She didn’t eat the carrots or cheese. Though that was pretty good for her, considering that she had never eaten a peanut butter sandwich until a little over a month ago. (She likes peanut butter, but in a bowl, eaten with a spoon.)

Today was an early dismissal. Tomorrow is the first full day. We will see how that goes.

Roadmap: PocketMod for MouseCal

July 23rd, 2008

I have been a fan of the Hipster PDA for years now. I carry it everywhere I go. So when I created MouseCal, I had a desire to “integrate” it with the Hipster. I have been trying to figure out the best way to do this. One idea that I have is to implement a PocketMod with a calendar/agenda of events. I have some incentive to try and develop something soon, as I will be going to Disney World at the beginning of August. This will be a good opportunity to “field test” a PocketMod with MouseCal data.

Surprised at Starbucks, morning coffee was free

July 20th, 2008

I was very surprised this morning at my local Starbucks. I stopped by the drive through window after church for my standard grande cappuccino. There was a car in front of me. When I got up to the window, I could see that there was a larger-than-normal number of customers inside the store. When the barista handed me my drink, I had my Starbucks card ready. But she said not to worry, it was on the house for having to wait so long. I said thank you very much.

I wonder if this is related to Starbucks refocusing on customer service and providing good coffee. I was impressed that the local store gives a front-line employee who interacts with customers the latitude to provide a free drink. This type of discretion can also be found at my favorite vacation spot, Walt Disney World. I didn’t complain about the wait, but the server provided a pleasant customer service experience for me. She was proactive. And it made me write this post. (Though that won’t get them a lot of free publicity. It is mostly my family who reads this, and they aren’t coffee drinkers.)