The ability to manage large databases entirely in memory is a game-changer across a range of industries. "Having millions of people connecting to an application that's underpinned by a spinning disk, it's just not going to return information fast enough," remarks David McJannet, VMware's director of cloud and application services, in an interview with ReadWriteWeb. "You see it when people are scanning with a barcode reader or with their cell phone. They scan something, and then they wait three, four, five seconds for the answer to come back. That experience is not acceptable for most new applications, where people expect near-instant response regardless of where they are."
Fire in the Hole
The emerging space of in-memory database systems now includes VMware SQLFire, SAP HANA and Oracle's TimesTen (acquired in 2005) for Exalogic. All were designed as pure transactional databases for storing, retrieving and updating records, yet each is being marketed differently as its vendor looks for a foothold. TimesTen, for example, is often positioned as a cache for Exalogic - something that extends rather than replaces the Exalogic experience. That makes sense for a company that doesn't want to cannibalize its own product line but also would prefer not to be eaten alive. HANA is an extraordinarily strong transaction processor, yet SAP markets it as an analytics system better suited to online analytical processing (OLAP) applications. Today, SAP is pairing HANA with another of its acquisitions, SuccessFactors, in hopes that a killer app will expose HANA to a wider audience.
SQLFire's value proposition is what media marketers would call a "pure play" and, as such, is more of a gamble on VMware's part. Having no legacy database platform of its own to cannibalize, VMware is in a position to market SQLFire as a faster transaction processing system than almost anything currently deployed - as an all-around player, not an extension or an add-on. And although SQLFire may be best suited today for typical online transaction processing (OLTP) applications that use relatively simple schemas, as virtualization systems like VMware's vFabric become more adept, those schematic differences may become negligible.
"Disk is the new tape," proclaims VMware's McJannet. "It's considered to be a permanent mechanism, but not really appropriate for the main day-to-day interaction with data. I think SAP HANA is a fascinating example of this trend coming to life. With SAP HANA, I think the way we would describe it is to say there's huge disruption going on in the data landscape. We think of it in three different directions: One, there's clearly a move to Big Data, where the volumes of data that people are working with are massively different today than they were a few years ago. That's largely about how to do analytics on Big Data volumes, and you see technologies like Hadoop really coming to the fore there. [Two,] we believe there's a similar shift that we would refer to as 'Fast Data,' which is really about this shift to in-memory to accomodate these new kinds of applications. People expect near-instantaneous response at a scale that really didn't exist five to 10 years ago. So as a category, we see a huge shift towards in-memory at the data tier. And SAP, with HANA, is clearly perfectly aligned to this shift. For analytics applications around the SAP data warehouse, they're saying, let's push that into memory so that I can return analytics information for my users far faster than if that data were all stored on a spinning disk."
Semi-Structure
After graciously acknowledging the inroads HANA has undeniably made, McJannet makes the case for a third direction in database disruption, which he describes in terms of flexibility. It's not so much unstructured data, which is normally associated with "Big Data" storage systems like Hadoop, but rather the notion that the structure of data itself may be variable, to suit the application for which it's being used.
"When we think of the disruption that's under way in the data world, I think... you're seeing the emergence of unstructured, loosely typed data constructs in the NoSQL category - things like MongoDB and others. The use case for Hadoop is really all about analytics. A former colleague, now at Hortonworks, described Hadoop as basically like a refinery where you can load up terabytes and terabytes of data onto a Hadoop cluster and then have it refined down into maybe a couple of hundred gigabytes that you actually then want to do analytics on... A traditional data warehouse would just fall over if I tried to load a hundred terabytes of data into Teradata. When I think of the shift to in-memory, it could be for analytical or transactional use cases - like, I need to submit an order to your website, or I need to get a quote from my insurance agent. That's probably the biggest differentiation with respect to SAP HANA."
Today's transactional databases require developers to interact with data not only through SQL (or, as is often the case, something made to look like SQL) but also through general-purpose languages such as Java, and even traditionally Web-centric languages such as Python and Clojure. "To me, SQLFire just looks like a database," explains McJannet. "I can put data into it, pull data out of it, and submit queries using standard SQL. When I am building my overall application, it acts and is a database." For interacting with the data once it's retrieved, a prime candidate is Java with the Spring framework; the results of those interactions can then be stored back in SQLFire, as sketched below.
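Concretely, "just looks like a database" means the ordinary JDBC pattern applies unchanged. The following is a minimal illustrative sketch, not official sample code: it assumes a SQLFire server listening on the default port 1527, the SQLFire client JAR on the classpath, and an invented orders table.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Hypothetical sketch: querying SQLFire over plain JDBC, exactly as one
// would query any SQL database. The host, port, table and columns here
// are assumptions for illustration.
public class SqlFireExample {
    public static void main(String[] args) throws Exception {
        // Assumes the SQLFire client driver registers itself via JDBC 4
        // service loading when its JAR is on the classpath.
        try (Connection conn = DriverManager.getConnection("jdbc:sqlfire://localhost:1527/");
             PreparedStatement stmt = conn.prepareStatement(
                     "SELECT id, status FROM orders WHERE customer_id = ?")) {
            stmt.setInt(1, 42);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.printf("order %d: %s%n",
                            rs.getInt("id"), rs.getString("status"));
                }
            }
        }
    }
}
```

Nothing in that code reveals that the data lives in memory rather than on disk, which is precisely the point McJannet is making.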
"Whether it's Java or Ruby with Rails, it really doesn't matter," he continues. "All of those languages expect you to interact with the database at a certain point. With SQLFire, we are providing an in-memory database that looks, acts like, and is a SQL-compliant database."