Carlos Guestrin has always been a big dreamer. As a kid he was a sci-fi fan and longed to build robots. Today, as a professor at the University of Washington, he’s one of the country’s leading thinkers in machine learning technology.
He’s still holding out hope for building autonomous robots, but his more immediate aspiration is to bring machine learning technology to the masses. Companies such as Amazon and Netflix use machine learning software to make personalized recommendations to customers; banks use it to detect fraud; and real estate sites like Zillow use it to size up a neighborhood.
We sat down with Guestrin, who is also CEO of machine learning company Dato, to learn how this technology works, what the current and emerging use cases are and why regular businesses should pay attention to it.
What is machine learning and how does it work?
Tom Mitchell, a machine-learning expert at Carnegie Mellon, characterizes it as the ability of a system to improve its performance of a task based on experience and data.
Take spam filtering, which has gotten a lot better in recent years. In the past, spam filters would just look for certain keywords like “offer” or “cheap” and mark any email containing them as spam. Then the spammers got smarter; they would write “off3r” to trick the system.
Spam filters that integrate machine learning algorithms take much more data into account. They weigh various factors to determine if an email is spam: Is the sender in your address book? What do you normally do with emails from that address? What characteristics do the non-spam emails that you actually read have? The machine learning technology looks for patterns in your email and backs up its decisions with data. Now, not every email with the word “offer” is automatically marked as spam.
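The contrast between the two approaches can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not how any real spam filter is built: the feature names, the four-email training set, and the perceptron-style update rule are all hypothetical choices made for the example.

```python
# Toy comparison: a fixed keyword filter vs. a filter that learns weights
# from labeled examples. Features, data, and update rule are hypothetical.

SPAM_KEYWORDS = {"offer", "cheap"}

def keyword_filter(words):
    """Old-style filter: flag any email containing a listed keyword."""
    return any(w in SPAM_KEYWORDS for w in words)

def extract_features(email):
    """Turn an email into numeric signals like those named in the text."""
    return {
        "has_spam_keyword": 1.0 if keyword_filter(email["words"]) else 0.0,
        "sender_in_address_book": 1.0 if email["sender_known"] else 0.0,
        "user_usually_reads_sender": 1.0 if email["usually_read"] else 0.0,
        "bias": 1.0,
    }

def score(weights, features):
    """Weighted sum of the signals; a positive score means 'spam'."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def train(examples, epochs=20, lr=0.5):
    """Perceptron-style training: adjust weights after each wrong prediction."""
    weights = {}
    for _ in range(epochs):
        for email, is_spam in examples:
            features = extract_features(email)
            if (score(weights, features) > 0) != is_spam:
                direction = 1.0 if is_spam else -1.0
                for name, value in features.items():
                    weights[name] = weights.get(name, 0.0) + lr * direction * value
    return weights

def learned_filter(weights, email):
    return score(weights, extract_features(email)) > 0

# Hypothetical labeled history: (email, was_it_spam)
examples = [
    ({"words": ["cheap", "offer"], "sender_known": False, "usually_read": False}, True),
    ({"words": ["special", "offer"], "sender_known": True, "usually_read": True}, False),
    ({"words": ["lunch", "tomorrow"], "sender_known": True, "usually_read": True}, False),
    ({"words": ["win", "prize"], "sender_known": False, "usually_read": False}, True),
]
weights = train(examples)

obfuscated = {"words": ["off3r", "now"], "sender_known": False, "usually_read": False}
friend = {"words": ["job", "offer"], "sender_known": True, "usually_read": True}
```

On this toy data the keyword filter misses the “off3r” message and wrongly flags the friend’s job offer, while the learned weights get both right because they lean on who the sender is rather than on one word. With so few examples the learned filter is crude (it distrusts every unknown sender), which is the practical reason such systems need a lot of data.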
How advanced is machine learning technology today, and how widely is it used?
We may still have a ways to go to get to the intelligent autonomous robots I dreamed of as a child, but we have a lot of great technology that is really exciting right now. One of the problems is that a lot of that technology is trapped in academic papers that people like me write, and we haven’t been able to transfer those ideas from theory into widely adopted practice.
The key to increased adoption will be to make it so that developers and companies can just use machine learning software and not have to worry about configuring the underlying algorithms that make it work.
Thankfully, there are a lot of companies working on this. There are startups, like mine, as well as big companies investing in these areas. Amazon Web Services and Microsoft Azure both launched machine learning services on their cloud platforms within the last year, which means anybody can try out this technology with a very low barrier to entry.
What benefit can regular businesses get from using machine learning?
Almost every industry right now is being disrupted by technology. Amazon has disrupted retail; Netflix has disrupted content; Google has disrupted the advertising industry and so on. Many of those disruptions have used machine learning technology, at least in part, particularly in the areas of recommendations and experience-based software. Amazon knows which products to recommend to you because of machine learning software. The same goes for Netflix suggesting you watch a certain movie or show. Both companies use a tremendous amount of data about individuals to inform these decisions, and they integrate that with data they have about all their other customers too.
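The idea of combining one customer’s history with data about all the other customers can be illustrated with a very small sketch. It assumes a simple “customers who share purchases with you also bought” scoring rule; the customer names, the purchase data, and the `recommend` function are hypothetical, and this is not a description of what Amazon or Netflix actually run.

```python
from collections import Counter

def recommend(purchases, customer, top_n=2):
    """Suggest items bought by customers who share purchases with `customer`."""
    own = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        if other == customer:
            continue
        overlap = len(own & items)  # shared purchases act as a similarity score
        if overlap == 0:
            continue  # ignore customers with nothing in common
        for item in items - own:  # only suggest items the customer lacks
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]

# Hypothetical purchase histories
purchases = {
    "ana": {"tent", "stove", "lantern"},
    "ben": {"tent", "stove", "sleeping bag"},
    "cho": {"tent", "sleeping bag", "boots"},
    "dee": {"novel"},
}

suggestions = recommend(purchases, "ana")  # ['sleeping bag', 'boots']
```

Items are weighted by how much the other customer’s history overlaps with yours, so the sleeping bag (bought by two similar customers) outranks the boots, and the novel from an unrelated customer never appears.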
These advances have led us as individuals to have an increased expectation that the world is personalized to us. Making your customers feel like individuals is a huge differentiator and machine learning helps automate that. If you combine this technology with great customer service, it’s a winning formula.
We’re on the cusp of many other industries becoming personalized. Take healthcare: Why should the treatment I get be the same as the one you get if we have different lifestyles and genetics? Personalized medicine could really change how we provide care, and machine learning could fuel it.
What components are needed for a successful machine learning deployment and what sort of infrastructure is needed to support it?
Machine learning is predicated on having good data, so a solid deployment will have a significant data storage requirement. There’s also a need for robust computational capacity. That said, a focus of my research has been optimizing compute capacity to crunch larger data sets on smaller machines.
Beefier machines with more memory tend to do better than a large number of less powerful machines. There could be some network bottlenecks depending on where the data is housed and where the computation workloads are processed. For these reasons, the cloud has become an increasingly popular destination for machine learning workloads. About half of our customers at Dato use our system in the public cloud.
What’s the difference between machine learning and artificial intelligence?
In the early days, there was not so much of an overlap, but the two fields are converging. The main difference is how the system uses data: Some AI tasks are not driven by data. But the more the system takes data into account in real time, the more it is a machine-learning platform.
Take a chess-playing program, for example. Initial versions of this AI program played by the rules of chess: The available moves were determined by rules that had been loaded into the software, and the program was intelligent enough to figure out the best one. Then machine learning was incorporated, so the system studied patterns of moves: what had worked in the past and what hadn’t. Once a system is more data-focused, the lines between ML and AI begin to blur.
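A toy sketch can make the distinction concrete. It is heavily simplified and every detail is an assumption for illustration: the hand-written rule (prefer the most valuable capture), the two candidate moves, and the record of past games are hypothetical, and real chess programs search many moves ahead rather than scoring single moves.

```python
# Hypothetical toy: choose among candidate moves with a hand-written rule
# versus with win rates observed in past games.

PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def rule_based_choice(moves):
    """Hand-coded rule: prefer the move that captures the most valuable piece."""
    return max(moves, key=lambda m: PIECE_VALUES.get(m["captures"], 0))["name"]

def learn_win_rates(past_games):
    """Estimate how often each move was followed by a win in recorded games."""
    wins, plays = {}, {}
    for move, won in past_games:
        plays[move] = plays.get(move, 0) + 1
        wins[move] = wins.get(move, 0) + (1 if won else 0)
    return {move: wins[move] / plays[move] for move in plays}

def data_driven_choice(moves, win_rates):
    """Learned preference: pick the move with the best historical win rate."""
    return max(moves, key=lambda m: win_rates.get(m["name"], 0.0))["name"]

# Hypothetical position: grab a pawn with the queen, or develop a knight
moves = [
    {"name": "Qxb7", "captures": "pawn"},
    {"name": "Nf3", "captures": None},
]

# Hypothetical history: (move played, did that player go on to win?)
past_games = [
    ("Qxb7", False), ("Qxb7", False), ("Qxb7", True),
    ("Nf3", True), ("Nf3", True), ("Nf3", False),
]

win_rates = learn_win_rates(past_games)
rule_pick = rule_based_choice(moves)              # 'Qxb7'
data_pick = data_driven_choice(moves, win_rates)  # 'Nf3'
```

The rule-based chooser grabs the pawn because its rule says captures are good, while the data-driven chooser prefers the knight move because it won more often in the recorded games. Neither function changes the rules of chess; only the source of the preference differs.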
This story, "5 questions for a top machine learning expert" was originally published by Network World.