A Simple Spark Structured Streaming Example

Recently, I had the opportunity to learn about Apache Spark, write a few batch jobs and run them on a pretty impressive cluster. The Spark cluster I had access to made working with large data sets responsive and even pleasant. We were processing terabytes of historical data interactively like it was nothing. The ease with which we could perform typical ETL tasks on such large data sets was impressive to me. [Read More]

Working with Relational Databases in R

On a recent project, we needed to connect to an MS SQL Server instance from R. Project members were mostly running some kind of Unix-like operating system. Connecting to a Microsoft product from R running on a Unix-like operating system was new to us. Documentation on how to do this was hard to find. This post describes how we configured our environment. Environment and Software Versions This post will use two virtual machines to explain how we configured our development environment. [Read More]

Bose SoundTouch Webservices API

Our family Christmas gift this year was a Bose SoundTouch 10 wireless music system. This thing is great. The setup was very simple. Using it on a daily basis could not be more user-friendly. This all-in-one system connects to the Internet and streams music like a boss. I like the 6 preset buttons and the ability to control which one is playing with a remote or my iPhone. Often, I use the SoundTouch while watching movies, streaming the audio over Bluetooth. [Read More]

Working with JSON and PostgreSQL

Recently, I needed to scrape some JSON documents and do some pre-processing on the data. I had to link documents together and query the data in order to generate an input file for another process. I thought about using MongoDB to help with some of the pre-processing. However, I hadn’t used MongoDB in over a year. The thought of brushing up on it for a one-off task was not super exciting to me. [Read More]

HTTP Middleware Written in Go

Middleware, It’s Everywhere The PHP community has recently formalized the representation of HTTP messages in PSR-7. The hope is that agreeing on common interfaces for representing HTTP messages will lead to more interoperability between PHP frameworks. As a result, web development will be more like composing pieces of re-usable middleware into an application. This will be good for the PHP community. Some frameworks like the popular Slim micro framework are already providing PSR-7 support. [Read More]
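The idea of composing re-usable middleware into an application can be sketched in Go with the standard `net/http` package. This is only a minimal illustration, not the code from the post: the names `Middleware`, `Chain`, `WithHeader`, `RequireMethod`, `newApp` and `serve` are hypothetical helpers made up for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps a handler and returns a new handler (hypothetical type).
type Middleware func(http.Handler) http.Handler

// Chain applies the middleware so that the first one listed runs first.
func Chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// WithHeader sets a response header before calling the next handler.
func WithHeader(key, value string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.Header().Set(key, value)
			next.ServeHTTP(w, r)
		})
	}
}

// RequireMethod rejects any request that does not use the given method.
func RequireMethod(method string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if r.Method != method {
				http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// hello is the innermost application handler.
func hello(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "hello")
}

// newApp composes the example application from its pieces.
func newApp() http.Handler {
	return Chain(http.HandlerFunc(hello),
		WithHeader("X-App", "demo"),
		RequireMethod(http.MethodGet))
}

// serve runs one in-memory request and returns status, body and headers.
func serve(h http.Handler, method, path string) (int, string, http.Header) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code, rec.Body.String(), rec.Header()
}

func main() {
	code, body, _ := serve(newApp(), "GET", "/")
	fmt.Println(code, body)
}
```

Each middleware only knows about the `http.Handler` interface, which is what makes the pieces interchangeable. In a real server the composed handler would be passed to `http.ListenAndServe` rather than to the in-memory `serve` helper.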

Deploying a Server Written in Go on Linux

Deployment Context and Preferences Recently, I deployed a REST server written in Go. The target system is a Linux box running CentOS 7. The service is managed using systemd. Logrotate is configured to rotate the server’s logs on a daily basis. I couldn’t find any “how-to” that provided step-by-step instructions for configuring services written in Go. This is an attempt to document how to set up a simple service written in Go. [Read More]

A Poor Man's ETL Tool

Some days it seems that all we do is move data around. We take it from “here”, manipulate it, reformat it and send it over “there” for display or more analysis. ETL is short for Extract, Transform and Load. There are lots of tools, especially in data warehousing, that help us with these kinds of tasks. What you can accomplish with these tools using a drag and drop interface is truly amazing. [Read More]

REST Web Service in Go - Encapsulating a Worker Pool

I wanted to make performance improvements in a PHP application recently. One section of code was doing a lot of number crunching and performance was just not great. I looked at creating a PHP extension in C to do the calculations. This might have helped but the thought of working in C felt a bit daunting. I told myself that this would be a good opportunity to explore Go a little further by trying to solve this issue with goroutines. [Read More]
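As a rough illustration of spreading number crunching across goroutines, here is a minimal worker pool sketch. It is not the implementation from the post: the function `sumSquares` and its squaring workload are hypothetical stand-ins for the real calculations.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares fans the inputs out to a fixed number of worker goroutines,
// squares each value, and adds up the results. (Hypothetical workload.)
func sumSquares(nums []int, workers int) int {
	if workers < 1 {
		workers = 1 // always keep at least one worker so jobs get drained
	}
	jobs := make(chan int)
	results := make(chan int)

	// Start the workers; each one reads jobs until the channel is closed.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range jobs {
				results <- n * n
			}
		}()
	}

	// Feed the jobs, then signal that no more work is coming.
	go func() {
		for _, n := range nums {
			jobs <- n
		}
		close(jobs)
	}()

	// Close the results channel once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4}, 3)) // prints 30
}
```

A fixed pool size keeps the number of concurrent goroutines bounded no matter how many jobs arrive, which is usually the reason to encapsulate a pool behind a service.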

Useful Go Resources

These are resources that I have found useful in learning about Go. I particularly found the first and second links to be really useful. Enjoy! O’Reilly class - Introduction to Go Programming with John Graham-Cumming : Intro to Go Curated list of Go frameworks and libraries : Awesome Go Static web engine : Hugo Go Code Layout : Code Layout Go by Example : Site Go Lang from beginner to advanced : Devcast Build web applications with golang : Online book Golang UK Conference : On YouTube Go book list : On GitHub [Read More]

CouchDB Serving up Temperature Data

My goal this week was to find useful and preferably large data sets. Unfortunately, the kind of data I was hoping to work with is not available to the public. I am still hoping to work with this type of data in the near future. In the meantime, one of my colleagues gave me access to a nice data set that was immediately useful. This data set contains temperature readings for our office. [Read More]