The MISO Project: Making Interactive Story-Telling Open

Friday saw the release of the keystone for Miso, an Open Source JavaScript data visualization library. It is a joint project between The Guardian and Bocoup, supported by The Guardian's Global Development desk, which is funded by the Bill & Melinda Gates Foundation.

The foundation of any data project lies in cleaning, sorting and connecting the data through all the steps needed for analysis and presentation. This is the lesson I am learning at the moment while working with a dataset of the order of 10,000 to 50,000 rows. There are no clear steps involved. There is no set methodology. There is no “10 Steps to Understanding Your Data” guide.

What I have are pipes and a workflow system. These need to be constructed for each data project, as they depend on the type, structure and cleanliness of the original data. Each node could be a parser, a refining tool or a database query. With each iteration of the data-exploring process, the data needs to be piped through various parts of this circuit. I talked about this in my previous post and will draw up my network flow and post my scripts to GitHub with every project I complete. A neat trick I’ve just learnt is to use command-line scripts to pipe your data through individual parsing steps using standard input and standard output, so as to avoid clunky file paths.
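As a rough illustration of that trick, here is a minimal Python sketch of a single step in such a pipeline. The script name clean_rows.py and the cleaning rules (trim whitespace, drop empty rows) are assumptions made up for this example; they are not the scripts from my own projects, and a real step would do whatever that particular dataset needs.

```python
#!/usr/bin/env python3
"""clean_rows.py: a hypothetical pipeline step that reads CSV on stdin and writes CSV on stdout."""
import csv
import sys


def clean_rows(source, sink):
    """Strip whitespace from every cell and drop rows that are entirely empty.

    `source` and `sink` are any text streams, so the same function works
    on sys.stdin/sys.stdout in a pipe or on in-memory buffers in a test.
    Returns the number of rows written.
    """
    reader = csv.reader(source)
    writer = csv.writer(sink, lineterminator="\n")
    kept = 0
    for row in reader:
        cells = [cell.strip() for cell in row]
        if any(cells):  # skip blank lines and rows of empty cells
            writer.writerow(cells)
            kept += 1
    return kept


# Only run as a filter when data is actually being piped in.
if __name__ == "__main__" and not sys.stdin.isatty():
    kept = clean_rows(sys.stdin, sys.stdout)
    # Report on stderr so the diagnostic never pollutes the data stream.
    print(f"kept {kept} rows", file=sys.stderr)
```

Because the step only touches standard input and standard output, it can be chained with other steps in the shell without hard-coding any paths, for example `python clean_rows.py < raw.csv | python next_step.py > clean.csv`, where raw.csv, next_step.py and clean.csv are placeholder names.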

Being able to see this circuit and having control over the flows makes recursive journalism possible, and the Miso release, Dataset, is built with this sentiment in mind. Alex Graul, the Guardian developer behind the project, explains:

One of the most common patterns we’ve found while building JavaScript-based interactive content is the need to handle a variety of data sources such as JSON files, CSVs, remote APIs and Google Spreadsheets. Dataset simplifies this part of the process by providing a set of powerful tools to import those sources and work with the data. Once data is in a Dataset, it becomes simple to select, group, and calculate properties of, the data. Additionally, Dataset makes it easy to work with real-time and changing data, which pose one of the more complex challenges to data visualization work.

I’ve seen this in action: I updated a Google Spreadsheet and the resulting chart updated automatically. I am looking forward to working with the library as a journalist and a developer to really reap the benefits of Open Source. Building a properly documented and described piece of software is a massive job, and it needs to be done meticulously for the Open Source community to take it on. Miso is for anyone who wishes to call themselves a ‘developer’ or a ‘journalist’. It is organisation-neutral and part of a growing trend of opening up newsroom tools.

To keep up to date, you can follow @themisoproject on Twitter and check out the code on GitHub. The lead developers behind the project are Alex Graul and Irene Ros.
