Research into recommender systems took off with the Netflix Prize. The contest started in 2006, and for three years contenders worked hard to reach the prescribed error threshold. Finally, in 2009, Netflix awarded the prize: one million dollars.
We can think of two reasons for using distributed machine learning: because you have to (the data won't fit on a single machine), or because you want to (hoping it will be faster). Only the first reason is a good one.
Distributed computation is generally hard, because it adds a layer of complexity and communication overhead. The ideal case is scaling linearly with the number of nodes; in practice that rarely happens. Emerging evidence shows that very often one big machine, or even a laptop, outperforms a cluster.
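To see why linear scaling is the exception, consider Amdahl's law, which bounds the speedup when only a fraction of the work parallelizes. A minimal sketch (the function name and the 95% figure are illustrative assumptions, and real clusters do worse still, since the law ignores the communication overhead mentioned above):

```python
def amdahl_speedup(n_nodes, parallel_fraction):
    """Amdahl's law: best-case speedup on n nodes when only
    `parallel_fraction` of the work can be parallelized."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_nodes)

# Even with 95% of the work parallelizable, 16 nodes
# give well under a 16x speedup:
print(round(amdahl_speedup(16, 0.95), 2))  # 9.14
```

The remaining serial 5% caps the achievable speedup at 20x no matter how many nodes you add.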
In part one we attempted to show that fears of true AI have very little to do with present reality. That doesn't stop people from believing: they just say it might take many decades for machine intelligence to emerge.
How to dispute such claims? It is possible that real AI will appear. It's also possible that a giant asteroid will hit the earth. Or a meteorite, or a comet. Maybe hostile aliens will land; there have been a few movies about that too.
Recently a number of famous people, including Bill Gates, Stephen Hawking and Elon Musk, warned everybody about the dangers of machine intelligence. You know: SkyNet, Terminators, The Matrix, HAL 9000. (Her would probably be OK; we haven't seen that movie.) Better check on that AI, then; maybe it's the last moment to keep it at bay.
On March 4th Jürgen Schmidhuber tackled "ask me anything" questions on Reddit. The professor was very keen to answer; in fact, he continued to do so on the 5th, the 6th and beyond. Here are some of his thoughts we found interesting, grouped by topic.
Recently we took a look at Torch 7 and found its data ingestion facilities less than impressive. Torch’s biggest competitor seems to be Theano, a popular deep-learning framework for Python.
Torch 7 is a GPU-accelerated deep learning framework. It had been rather obscure until the recent publicity following its adoption by Facebook and DeepMind. This entirely anecdotal article describes our experiences trying to load some data in Torch. In short: it's impossible unless you're dealing with images.
In this post we'll be looking at 3D visualization of various datasets using data-projector from Datacratic. The original demo didn't initially impress us as much as it could have, maybe because the data is synthetic: it shows a bunch of small spheres in rainbow colors. Real datasets look better.
Python, being a general-purpose programming language, lets you run external programs from your script and capture their output. This is useful for many machine learning tasks where one would like to use a command-line application in a Python-driven pipeline. As an example, we investigate how Vowpal Wabbit's hash table size affects validation scores.
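The pattern looks roughly like this. The `run_command` helper is ours, the file names in the commented-out sweep are assumptions rather than anything from a real experiment, and `vw`'s `-b` flag sets the log2 size of its feature hash table:

```python
import subprocess
import sys

def run_command(args):
    """Run an external program and return its captured stdout as text.
    Raises CalledProcessError if the program exits with a nonzero status."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout

# Hypothetical sweep over Vowpal Wabbit's hash table size
# (requires vw on PATH; file names are placeholders):
#   for bits in (18, 22, 26):
#       log = run_command(["vw", "-b", str(bits), "-d", "train.vw"])
#       # ...parse the validation loss out of `log`...

# Portable demonstration of the capture itself, using the
# current Python interpreter as the external program:
out = run_command([sys.executable, "-c", "print('hello from a subprocess')"])
print(out.strip())  # hello from a subprocess
```

Note that `vw` prints its progress report to stderr, not stdout, so a real sweep would parse `result.stderr` instead; `capture_output=True` captures both.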
Geoff Hinton had been silent since he went to work for Google. Recently, however, he has come out and started talking about something he calls dark knowledge. Maybe some questions shouldn't be asked, but what does he mean by that?