Monthly Archives: April 2012

Channeling Dr. Seuss to explain real-time system analytics

Most companies have a UI/UX person.  Very few companies have a UI/UX person who doubles as a cartoonist and satirist.  Well, Nodeable does: Mike Evans.

On the downside, Mike rides a hipster bike and wears Bono-like sunglasses when he rides. He also has terrible recommendations on where to find good hot chocolate in San Francisco.

On the upside, he’s an award-winning film producer.  Not that Nodeable is in the habit of making movies.  But if we SERIOUSLY pivot, he’s the guy to make our My Own Private Utah movie.

Here’s one of his recent graphics for a presentation Dave is due to give in a few weeks.  Yes, it looks like it comes from a Dr. Seuss book.  But how many Dr. Seuss books do you know that deal with the hot topic of chewing through massive piles of system data to deliver actionable insights into how to optimize your infrastructure, and helping you resolve issues before they become problems?

Yes, that was a short infomercial.  But no, there are no such Dr. Seuss books.  Thankfully.

Vendor lock-in may well be the least of a CIO’s concerns: Defining openness in the cloud

There are many great reasons to use open-source software. Avoiding vendor lock-in is perhaps one of the weakest. It’s not that lock-in isn’t a real concern, but it’s generally not a CIO’s first consideration. The first consideration is getting stuff done.

Which is why Rackspace CEO Lanham Napier is almost certainly wrong to castigate Amazon over proprietary lock-in.  Not wrong because he’s incorrect.  Wrong because it’s an ineffective strategy.

It’s also why Red Hat’s Gordon Haff is likely wrong to take on VMware using the same argument.  VMware’s Matthew Lodge tweaked his open-source peers over the “I’m the most open” discussion, arguing that “While the ugly sisters were squabbling, customers were getting on with business and choosing their Cinderella as VMware.”  What irked Haff most, however, was Lodge’s follow-on comment: “Openness is not about how you write software, it’s about what you allow your customers to be able to do.”

Haff responds:

He has a very good point, but again, it doesn’t really go very far, which is why Red Hat for years has emphasized value, not fluffy intangibles, in its field marketing.  Yes, the company will talk about vendor lock-in in its high-level marketing messages, but the salespeople walking in to talk with a CIO?  They’re talking about performance-to-cost ratios over competitors like IBM and HP.  When it comes to talking business, Red Hat is all business.

And rightly so.

When a CIO reports to the CEO, she can’t point to “but look at all the freedom I gave us.”  She knows she’s going to have to deliver tangible results.  Which is why JBoss veteran (and CloudBees board member) Bob Bickel is right to point to alternative ways to be open in the cloud:

Again, this isn’t to fully deprecate the value of open source.  But it is to suggest that there are various ways to define openness in cloud computing, and source code is just one aspect among several, and perhaps not even the most important one.

At Nodeable, we feel that a nuanced approach to openness is the right one.  We use some great open-source software at the heart of our cloud analytics service, including Storm and Hadoop, but we don’t yet see how it would make much sense (or be of any real use to anyone) to open source our platform.  We do, however, see some value in open-sourcing our agent technology, and are exploring this.  Most important to us, however, is the ability for our users to easily get their data into and out of our analytics service.  And they can.

Data, source code, APIs, etc.  All factor into the new world of open.

Marten Mickos: The LAMP stack is dead, and cloud has killed it

For the past 10 years, the LAMP stack has laid waste to proprietary software stacks.  Yes, Microsoft has held onto gargantuan profits, but LAMP has become the foundation for leading web services, whether Google or Facebook or [Insert Big Web Brand Here].  LAMP is the future.

Or was.  That is, until cloud killed it, as Eucalyptus CEO (and former MySQL CEO) Marten Mickos posits in a great keynote from the Percona Live: MySQL Conference & Expo 2012.

No, LAMP is not going away.  But its comfortable existence as a somewhat coherent instantiation of Linux-Apache-MySQL-Perl/PHP/Python is gone.  Mickos, despite speaking at the MySQL conference, was quick to point to big changes in the database market, many of them not conducive to MySQL’s dominance in the cloud:

As he notes, we’re less concerned with stacks anymore:

From 2000 to 2010, it was all about the LAMP stack. To build an app, you took one Linux, one Apache, one database, one language. You stacked them on top of each other. You multiplied them to scale out….Today applications have many different languages and databases, and they’re not stacked upon each other.  We don’t stack things anymore.

Furthering this idea, today, in the cloud, users spin up new VMs. Each VM can have its own version of Linux. Each app typically has many databases (MySQL, Memcached, Sphinx, Hadoop, etc.) and they are not stacked on top of each other. Scaling out may mean scaling out with different instances.

On top of the database, you can have many languages. The same app may use PHP, Python, Ruby, JavaScript etc. Again, these are not stacked on top of each other but they are next to each other, and perhaps even next to each other in different VMs.

Mickos points to Amazon Web Services as an example: 10,000 images available for people to use. Those are the new distributions. Not Ubuntu or RHEL or whatever flavor of Linux you prefer.  They contain numerous different combinations of tens of pieces of software.

This doesn’t, of course, render a Red Hat or MySQL (Oracle) obsolete.  But it changes the game considerably, and makes managing the diversity of infrastructure within the enterprise much harder in many ways.  Indeed, following Randy Bias’ thinking, this complexity may be the exact opposite of what is needed to scale in the cloud.  While Mickos is right to point to the disparate software configurations Amazon offers on AWS, AWS itself uses a very unified set of infrastructure components, down to the hardware.

Still, Mickos’ point feels correct: right or wrong, we no longer rely on a unified LAMP stack to run our applications.  We use a variety of Linux distributions, databases, programming languages, etc.  Whether we’ll get back to a unified infrastructure “stack,” in the way Bias suggests we should if we hope to scale, is an open question.  Maybe it will be the successor to LAMP.  But for now, it’s complexity all the way down.

Making sense of the data deluge (Why aren’t we learning from our data?)

One of the biggest problems with systems management is that existing tools offer a complex glimpse into what happened, rather than giving users insight into what’s happening (and will likely happen soon, absent intervention).  This isn’t a limitation in technology: it’s a problem with our imagination and willingness to apply the cloud to tackling systems problems, including cloud systems.

O’Reilly’s Mike Loukides picks up on this in a great article pointing to the fundamental flaws in system monitoring tools:

It’s easy to detect problems when something goes out of spec: If you have a fever, you know you’re sick. But how do you detect problems that don’t set off an alarm? How many diseases have early symptoms that are too subtle for a human to notice, and only accessible to a machine learning system that can sift through gigabytes of data?

Loukides goes on to bemoan the fact that “the state of system and network monitoring hasn’t changed in years,” and that observation applies as well to HP OpenView as it does to Splunk.  These hugely expensive tools do little more than store everything and provide complex ways to query the database of arcane log data.

Bravo.
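
The distinction Loukides draws, between a fever that trips an alarm and a subtle symptom that never does, can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not how any particular tool works: the function names, the 90% limit, the ten-sample window, the three-sigma rule, and the CPU readings are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def fixed_threshold_alerts(samples, limit):
    """Classic monitoring: flag only the samples that exceed a fixed limit."""
    return [i for i, value in enumerate(samples) if value > limit]


def baseline_drift_alerts(samples, window=10, sigmas=3.0):
    """Flag samples that deviate sharply from the recent rolling baseline."""
    recent = deque(maxlen=window)  # holds only the last `window` readings
    alerts = []
    for i, value in enumerate(samples):
        # Only judge a sample once a full window of history exists.
        if len(recent) == window:
            mu, sd = mean(recent), stdev(recent)
            if sd > 0 and abs(value - mu) > sigmas * sd:
                alerts.append(i)
        recent.append(value)
    return alerts


# Hypothetical CPU-utilization readings (percent).  The jump at the end
# stays far below a 90% alarm limit, so a fixed threshold never fires.
cpu = [20, 21, 19, 20, 22, 21, 20, 19, 21, 20, 45]

print(fixed_threshold_alerts(cpu, limit=90))  # no sample exceeds the limit
print(baseline_drift_alerts(cpu))             # the last sample breaks from its baseline
```

A rolling baseline is about the simplest stand-in for the machine learning Loukides has in mind; the point is only that the interesting signal is the change in behavior, not the crossing of a preset line.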

The irony is that employees are starting to find better, and dramatically cheaper, ways to track what’s going on in their systems.  A recent Forrester survey finds that rank-and-file employees spend roughly $2,000 each year on hardware and software to get their jobs done.  Some of that money is going to analytics services (like Nodeable) that help them keep abreast of the subtle trends and anomalies in their systems.  We’ve opened the door to over 400 companies to participate in our beta, many of whom are big-name companies where employees are tired of paying big-name price tags for systems that don’t actually help them very much.

Nor is Big Data machine-learning isolated to systems management.  The mobile devices we slavishly carry with us are giving off all sorts of data, which can tell us much about how to change the way we eat, exercise, drive, etc.

It’s a Big Data world.  It’s just unfortunate that so many of our biggest vendors, those arguably in the best position to invest in making this happen, are doing so little.  IBM, as pointed out by Loukides, may be the exception.  But in many cases, it’s the startups of the world that are leading the way.

If the cloud is all about simplicity, why is it so complex?

Software now leads hardware in terms of overall tech spending, and the biggest growth in software spending is actually not software at all, according to Forrester.  No, the biggest growth drivers in tech today are SaaS applications, general business intelligence products, and specialized analytical tools.  Sure, some of these run behind the firewall, but increasingly businesses are following consumers to the cloud.  Where businesses don’t seem content to follow consumers, however, is in the simplicity of their cloud products.

Some enterprise vendors get this.  Take Box, for example, whose cardinal rule for its content collaboration services is simplicity.  If real human beings don’t want to use the software/services, then why should enterprises waste money trying to force them to do so?

But far too many cloud vendors mire themselves and their users in complexity.  As Charles Babcock writes, “I am struck over and over again how easy it is to discuss cloud computing in high sounding terms, while those plunging into the cloud are thrust into a welter of new technology processes and complex responsibilities.”  Bingo.

Cloudscaling’s Randy Bias goes even deeper, picking apart the problems Infrastructure as a Service vendors have had in making it super-easy to run systems in the cloud.  As he notes, “[M]any engineers see understanding and developing complex systems as a rite of passage.  In reality, the true test of a great engineer is their ability to make things simpler, not more complex.”  As he goes on to say, complex systems tend to fail, but simplicity enables scalability.

Just ask Amazon, whose public cloud is by far the most widely used in large part because it’s comparatively easy for developers to use.

I see this in the systems management world, which actually tends to compound the problem of complex cloud systems with complex tooling that exacerbates the very problem it could help to solve.  Bias talks about the problems inherent in presenting users with too many choices (e.g., multi-hypervisor IaaS offerings), but the same is true with the language used to describe systems.

For example, we talk a lot about real-time, but the best description may actually be this one that I found: human real-time.  Ultimately, a user really doesn’t care how the vendor goes about processing and delivering insights into systems so long as it’s happening as fast as they need.  Nodeable uses continuous computation.  Other vendors may prefer an alternative.  But, again, I suspect very few users actually care.  They just care that the analytics serve up insights that are timely and germane to their jobs.
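
For the few who do care, here is a minimal sketch of the general idea, assuming “continuous computation” means folding each event into a result as it arrives instead of rescanning a stored batch.  It is an illustration only and does not describe Nodeable’s implementation; `batch_mean`, `RunningMean`, and the latency figures are hypothetical.

```python
def batch_mean(stored_events):
    """Batch style: wait until the data is stored, then scan all of it."""
    return sum(stored_events) / len(stored_events)


class RunningMean:
    """Continuous style: fold each event into the result as it arrives."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental mean: shift the current mean toward the new value.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


# Hypothetical response times in milliseconds.
latencies = [120, 80, 100, 300]

live = RunningMean()
for ms in latencies:
    current = live.update(ms)  # an up-to-date answer after every event

# Both approaches agree once all events are in; only the timing differs.
print(current, batch_mean(latencies))
```

Both paths reach the same number.  The difference is that the continuous version has an answer after every event, which is what makes “as fast as they need” achievable.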

Cloud was supposed to make life easier for the enterprise, and I think it’s accomplishing that goal, on balance.  But we have a long way to go toward simplifying the cloud for users, and not just in how we explain it.  Recent history suggests that the companies that do best at simplifying complex systems – think Apple, Microsoft, Amazon, and others – are those that win big in terms of revenue.  It turns out simplicity pays.


Big Data and the race for real-time: things are about to get interesting

Life is about to become much more interesting in Big Data land.  While data have always been with us, it’s only recently that Hadoop and other Big Data technologies have dramatically altered the economics of collecting, managing, and utilizing data to drive businesses.  Now it appears that even these open technologies, from Hadoop to Cassandra, are about to be transformed themselves, leaving the world of batch-oriented data processing behind in favor of real-time analytics.

Hold on to your seats.

As powerful as Hadoop is, it has one significant shortcoming: it’s batch-oriented.  Even a few years ago, this was fine, as just being able to gather and crunch the data after the fact, at a much lower cost than traditional data mining, was a huge win.  But, as Todd Papaioannou, formerly vice president of cloud architecture at Yahoo! and now founder of Continuuity, argues, “people are expecting much more of a real-time experience” on the web, something that Hadoop hasn’t historically delivered.  He further notes,

It’s not clear to me, as an industry, that we have nailed that [real-time analysis] problem. It is clear to me that we need to solve that problem, and that the next big wave of applications is going to be real-time and to get to real-time, you have to take the human out of the loop.

This isn’t to suggest anyone should stay on the sidelines and wait for Hadoop (and other NoSQL databases) to achieve real-time status.  Far from it.  Many an industry is already transforming itself through the Big Data intelligence that Hadoop and other technologies enable, batch orientation and all.

After all, just starting to work with data at all is a big deal.  The stakes are huge, as a wide variety of industries are sprinting to take advantage of the treasure troves of data available to them.  As the Wall Street Journal reports, it used to be enough to mine receipts and other consumer data to find nuggets of information that could affect your business.  But now the Holy Grail is “getting and making effective use of information as it happens.”

We’re not far off.  Just as the financial services industry used to operate on a 20-minute time lag with stock information, but now streams real-time stock information to traders and others, so, too, will industries as varied as retail and agriculture increasingly base decisions on up-to-the-second information about purchasing trends, weather patterns, and more.

Of course, we still need better Big Data-savvy applications to make sense of the data.  But these are coming.

What is needed now, perhaps more than anything else, is to enable Hadoop, the clear front-runner in the NoSQL sweepstakes, as a real-time data storage and processing tool.  Continuuity doesn’t indicate how it plans to accomplish this, but there are others working on this same problem, including Nodeable.  (Not surprisingly, we believe we have cracked the code.  :-)  Regardless of who crosses that real-time finishing line first, many industries are benefiting from the Big Data gold rush today, and will benefit even more when we can make Hadoop real-time and enable a host of data-intensive applications to tame some of Hadoop’s complexity.

Again, the stakes are huge, which is why so much investment is going into this, both in terms of venture capital and enterprise IT.  As these converge, expect to see a transformation of how businesses operate.  In real-time.  At high levels of efficiency.

Game on.
