[FoRK] Beberg sightings in the wild

Eugen Leitl eugen at leitl.org
Tue Jun 16 02:06:30 PDT 2015


AI Supercomputer Built by Tapping Data Warehouses for Their Idle Computing
Power

Sentient claims to have assembled machine-learning muscle to rival Google by
rounding up idle computers.

By Tom Simonite on June 1, 2015


Putting more power behind machine-learning software could make it much more
capable.

Recent improvements in speech and image recognition have come as companies
such as Google build bigger, more powerful systems of computers to run
machine-learning software. Now a relative minnow, a private company called
Sentient with only about 70 employees, says it can cheaply assemble even
larger computing systems to power artificial-intelligence software.

The company’s approach may not be suited to all types of machine
learning, a technology that has uses as varied as facial recognition and
financial trading. Sentient has not published details, but says it has shown
that it can put together enough computing power to produce significant
results in some cases.

Sentient’s power comes from linking up hundreds of thousands of computers
over the Internet to work together as if they were a single machine. The
company won’t say exactly where all the machines it taps into are. But
many are idle inside data centers, the warehouse-like facilities that power
Internet services such as websites and mobile apps, says Babak Hodjat,
cofounder and chief scientist at Sentient. The company pays a data-center
operator to make use of its spare machines.

Data centers often have significant numbers of idle machines because they are
built to handle surges in demand, such as a rush of sales on Black Friday.
Sentient has created software that connects machines in different places over
the Internet and puts them to work running machine-learning software as if
they were one very powerful computer. That software is designed to keep data
encrypted as much as possible so that what Sentient is working on, perhaps
for a client, is kept confidential.

Sentient can get up to one million processor cores working together on the
same problem for months at a time, says Adam Beberg, principal architect for
distributed computing at the company. Google’s biggest machine-learning
systems don’t reach that scale, he says. A Google spokesman declined to
share details of the company’s infrastructure and noted that results
obtained using machine learning are more important than the scale of the
computer system behind it. Google uses machine learning widely, in areas such
as search, speech recognition and ad targeting.

Beberg helped pioneer the idea of linking up computers in different places to
work together on a problem (see “Innovators Under 35: 1999”). He was
a founder of Distributed.net, a project that was one of the first to
demonstrate that idea at large scale. Its technology led to efforts such as
SETI@home and Folding@home, in which millions of people installed software so
their PCs could help search for alien life or contribute to molecular biology
research.

Sentient was founded in 2007 and has received over $140 million in investment
funding, with just over $100 million of that received late last year. The
company has so far focused on using its technology to power a
machine-learning technique known as evolutionary algorithms. That involves
“breeding” a solution to a problem from an initial population of many
slightly different algorithms. The best performers of the first generation
are used to form the basis of the next, and over successive generations the
solutions get better and better.
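The breeding process described above can be sketched in a few lines of Python.
This is an illustrative toy, not Sentient's actual system: the fitness
function, population sizes, and mutation scheme below are all invented for
the example.

```python
import random

random.seed(0)

def evolve(fitness, population, generations=100, keep=0.2, mutation_scale=0.1):
    """Breed a solution from an initial population of candidates.

    fitness: scores a candidate (higher is better).
    population: initial list of candidates (lists of floats).
    """
    for _ in range(generations):
        # Rank the current generation by fitness.
        ranked = sorted(population, key=fitness, reverse=True)
        # The best performers form the basis of the next generation.
        parents = ranked[:max(1, int(len(ranked) * keep))]
        # Each child is a slightly mutated copy of a surviving parent.
        population = [
            [gene + random.gauss(0, mutation_scale)
             for gene in random.choice(parents)]
            for _ in range(len(ranked))
        ]
    return max(population, key=fitness)

# Toy problem: evolve a vector toward the target [1.0, 2.0, 3.0].
target = [1.0, 2.0, 3.0]
fitness = lambda c: -sum((g - t) ** 2 for g, t in zip(c, target))
seed_pop = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(50)]
best = evolve(fitness, seed_pop)
```

Over successive generations the population drifts toward candidates that
score well, which is the "solutions get better and better" dynamic the
article describes, here compressed to seconds on one machine instead of
months on hundreds of thousands of processors.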

Sentient currently earns some revenue from operating financial-trading
algorithms created by running its evolutionary process for months at a time
on hundreds of thousands of processors. But the company now plans to use its
infrastructure to offer services targeted at industries such as health care
or online commerce, says Hodjat. Companies in those industries would
theoretically pay Sentient for those products.

He won’t say any more about what those might be. Sentient has done
research with the University of Toronto and MIT to create software that can
predict the development of sepsis in ICU patients from data such as blood
pressure and other vital indicators, says Hodjat. Results showed that the
software could give 30 minutes’ warning of sepsis developing, with about
90 percent accuracy, but the company has decided not to commercialize that
work, he says.

More recently Sentient has been trying to adapt its approach to work with a
type of artificial intelligence called deep learning. This technique has
recently produced striking breakthroughs in areas such as image and speech
recognition, and it’s become the main focus of work on artificial
intelligence at companies such as Google, Facebook, and Baidu (see “10
Breakthrough Technologies 2013: Deep Learning”). Some of the best results
in deep learning come from running software on very powerful, specialized
computers (see “Baidu’s Artificial-Intelligence Supercomputer Beats
Google at Image Recognition”).

Reza Zadeh, a consulting professor at Stanford University who works on
getting machine learning to work at scale, says that using a big collection
of computers in different places works well for some problems, but not all.

It is most powerful when a task can be split into small pieces that
individual computers can work on without needing to communicate much over the
Internet, which is relatively slow. But some of the most promising ways to
make machine learning more powerful require different processors to
communicate a lot, says Zadeh.
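The kind of workload that suits a loosely connected system can be sketched as
follows. The objective function and pool size are invented for illustration;
the point is that each evaluation is independent, so workers never need to
exchange data with one another, and a slow link only carries inputs and
results.

```python
from concurrent.futures import ProcessPoolExecutor

def score_candidate(candidate):
    """CPU-bound work a machine can do on its own: evaluate
    one candidate on a toy objective (sum of squares)."""
    return sum(x * x for x in candidate)

def evaluate_in_parallel(candidates, workers=4):
    """Farm independent evaluations out to worker processes.
    Because no worker communicates with another, the task
    tolerates slow links between machines."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score_candidate, candidates))

if __name__ == "__main__":
    candidates = [[i, i + 1] for i in range(8)]
    print(evaluate_in_parallel(candidates))
```

Training one large neural network is the opposite case: gradient updates must
flow constantly between processors, which is why, as Zadeh notes, it favors
tightly coupled hardware over machines scattered across the Internet.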

Google and Baidu have reported major results in using deep learning for
speech and image recognition by using very large data sets or building bigger
artificial neural networks. Both approaches require constant flows of data
between different processors, says Zadeh.

Beberg agrees that deep learning is harder to adapt to a system of hundreds
of thousands of computers linked over the Internet but says Sentient is
making progress. It has thousands of processors working on deep learning at
once, he says.

More information about the FoRK mailing list