Sabbatical week day 2: I fail at Octave

I’m taking a sabbatical week over the holidays. This week’s posts will serve as a sort of report of what I got up to the previous day instead of the usual schedule – wish me luck that I achieve even half of what I’d like to.


After I managed to get the Toggl and Toshl datasets on Monday, it was time to do something useful with them yesterday. Turns out I'm not very good at doing useful things with datasets, because my biggest achievement of the day was coming up with a plot of the data.

You know that all-round awesome data format that is JSON? Every programming language except Java has a nice and easy interface for loading and saving it right into native data structures. This makes it perfect! So it seemed only natural that my node.js scripts for fetching data would store it as JSON for future use.

Or so I thought.

If there is one thing I learned in ml-class, it's that one should always take some time to first model their machine learning algorithm in a mathematical language like Matlab/Octave before implementing it in a production-like language. Something about how all those matrix operations are easier and how having a language created especially for the task makes it all the easier to play around.

I guess octave is to machine learning as InDesign or Illustrator are to web design?

Turns out Octave not only lacks a native way of reading JSON, but even when you find a library it is impossible to say "Here is a file, make it a string, yo!" Just doesn't work. All files need to have a format or something ... it's really quite silly.

Luckily there was a simple solution - just dump the data as a column of numbers and Octave couldn't have been happier about it.
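For the curious, the "column of numbers" dump can be sketched in a few lines of node.js. This is only a rough illustration: the record shape, the `amount` field and the `toColumn` helper are made up for the example, since the real Toggl and Toshl exports look different.

```javascript
// Hypothetical records; the real Toggl/Toshl data has a different shape.
const entries = [
  { label: "day 1", amount: 12.5 },
  { label: "day 2", amount: 3 },
  { label: "day 3", amount: 40.25 },
];

// Pick one numeric field from every record and put each value on its own
// line, so the result is a plain text column of numbers.
function toColumn(records, field) {
  return records.map((record) => Number(record[field])).join("\n") + "\n";
}

const column = toColumn(entries, "amount");
console.log(column);
```

Writing `column` to a file with `fs.writeFileSync` should give Octave something it can read as a plain numeric matrix, without any JSON parsing on its side.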

As mentioned, I didn't get very far. This graph is the extent of my achievements yesterday:

Toshl and Toggl plotted

Just for fun I tried running linear regression on this data and, as expected, it failed horribly. The lowest cost comes from a function along the lines of y = -6.5541e+88*x - 4.8840e+90 ... I'm not even sure that coming up with fake-ish quadratic and cubic terms would do much good in this case, and since I only have a single parameter, neural networks wouldn't do much good either.
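To make "linear regression with a single parameter" concrete, here is a minimal sketch of fitting y = slope*x + intercept with the closed-form least squares formulas. It is not the Octave code I actually ran; `fitLine`, `cost` and the toy numbers are invented for the illustration.

```javascript
// Fit y = slope * x + intercept by ordinary least squares (one input variable).
function fitLine(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((sum, x) => sum + x, 0) / n;
  const meanY = ys.reduce((sum, y) => sum + y, 0) / n;

  let sxy = 0; // sum of (x - meanX) * (y - meanY)
  let sxx = 0; // sum of (x - meanX)^2
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (ys[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
  }

  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;
  return { slope, intercept };
}

// Half the mean squared error of a fitted line on the given data.
function cost(xs, ys, line) {
  let total = 0;
  for (let i = 0; i < xs.length; i++) {
    const error = line.slope * xs[i] + line.intercept - ys[i];
    total += error * error;
  }
  return total / (2 * xs.length);
}

// Toy data lying exactly on y = 2x + 1, not the Toggl/Toshl numbers.
const line = fitLine([1, 2, 3], [3, 5, 7]);
console.log(line, cost([1, 2, 3], [3, 5, 7], line));
```

With only one input there are just two numbers to learn, the slope and the intercept, which is why a straight line has so little room to follow data like mine.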

And either way, anything that comes even close to modeling this data will suffer from horrible overfitting and won't be much use ... luckily I have some other ideas I can try.