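Eric Kavanagh: Ladies and gentlemen, hello and welcome back once again to TechWise. My name is Eric Kavanagh. I will be your moderator for episode 3. This is a new show that we designed with our friends at Techopedia, a very cool website focused on technology, obviously, and of course here at The Bloor Group we are very focused on enterprise technology. So, all kinds of enterprise software, and the whole TechWise format is designed to give our attendees a really good look at a particular space. So, for example, we did Hadoop, we did analytics in the last show, and in this particular show we are talking all about the cloud.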
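So, it's called "The Cloud Imperative: What, Why, When and How." We're going to talk to a couple of analysts today and then to three vendors. Qubole, Cloudant and Attunity are the sponsors of today's show. Thank you very much for your time and attention today, and a big thank you to everyone else as well. And keep in mind that you, as attendees of these shows, play a significant role. We want you to ask questions, get involved, be interactive and let us know what you think, because obviously the whole purpose of this show is to help you folks understand what's going on in the world of cloud computing.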
The Cloud Imperative
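So, let's move right along. First, your host is yours truly, Eric Kavanagh, and then we have Dr. Robin Bloor calling in from the airport, and our good friend Gilbert, independent analyst Gilbert Van Cutsem, will also share some thoughts. Then we will hear from Ashish Thusoo, CEO and co-founder of Qubole. We'll hear from Mike Miller, chief scientist at Cloudant, and finally from Lawrence Schwartz, vice president of marketing at Attunity. So, we have a whole bunch of content lined up for you today.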
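So, the cloud imperative (the command from on high) is the concept that came to mind the other day when I was thinking about this. Really, cloud computing is huge these days. I mean, it is really fascinating to watch this stuff evolve, and one of the examples I often give is webcasting technology itself. Of course, those of you who dialed in early heard some interesting technical challenges. That is one issue with the cloud: things do change, formats change, standards change, interfaces change, and sometimes when you try to connect two different environments together you run into difficulties. So, that is really one of the things to worry about with cloud computing. Mind the architecture! You can see that in the last bullet point.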
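For our webcasts, one thing we do, just as a side note here, is use a separate teleconference vendor. Then we use WebEx. The reason we don't use WebEx audio is, frankly, that we used it a few years ago and it crashed and burned in the most unpleasant way. So, we are not willing to take that risk again. We actually use our own audio company, called Arkadin, and we pull all these different solutions together in real time. The idea is that we can also email you the slides through a separate email application, so that if WebEx crashes, for example, we can tell everyone to dial in on the phone, email you the slides and just walk through them more or less as if the WebEx environment weren't there. So, you can work around these issues, but the issues are everywhere.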
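But there are a lot of benefits to the cloud. Obviously, it offers a low barrier to entry, and of course the pioneer of cloud computing is salesforce.com, which clearly revolutionized business, especially sales force automation. But then you have things like Marketo and iContact and Constant Contact and Sailthru, and, my goodness, there are a lot of tools in marketing and sales automation, but it goes beyond that. HR is getting into the whole cloud game, and analytics is in the cloud game. Look at that little-known company Amazon Web Services and what they are doing with cloud computing: the scale is massive. I heard a really good quote the other day from a guy named David, with whom we did a lot of work and who is now at Cisco, which, as a matter of fact, is the company that bought WebEx. I'm not sure how much money they have invested in WebEx, but that's not my call, is it? Anyway, he works at Cisco these days and he had a really interesting, subtle quote, which is, "There is no one cloud; there are many clouds," and that's right. There are lots of clouds out there. In fact, every cloud provider is its own cloud. So, one of the challenges these days is connecting clouds, right? If you are a salesperson, wouldn't it be nice to connect directly to iContact and Constant Contact and LinkedIn, for example, and Twitter and other environments, the other clouds out there, and just pin together the business solution that makes sense for you and your company?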
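So, those are some issues to keep in mind, but the cloud is here to stay. Just know that on-premise software is here to stay, too. So, in the enterprise, or even in any small or medium-sized business, we have to figure out how to define an architecture and maintain it so that you can take advantage of the cloud without creating a sprawling architecture somewhere else, outside your control. Obviously, the whole data warehousing industry grew up around the need to consolidate critical information in order to analyze that information and make better decisions.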
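Well, now Amazon Web Services has Redshift. One of the biggest webcasts we ever did was on Redshift. That is a big deal. They are changing the dynamic; they are changing the pricing structure. You can see that prices for traditional enterprise software licenses have come down, partly because cloud computing exists and partly because these folks are cutting prices and putting pressure on pricing. So, that is good news for the end user. It is definitely something to keep in mind for anyone trying to use some of these technologies. So, keep that in mind; we will talk about it on the show today.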
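So, Dr. Robin Bloor will be our first analyst of the day. I'm going to go ahead and push his first slide and hand him the keys. Robin, I think you're out there somewhere. With that, I'll hand it over to you. The floor is yours!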
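Dr. Robin Bloor: Okay, Eric. Thanks for that introduction. A few days ago, I came across a survey of consumers that asked the question: do you think stormy weather interferes with cloud computing? More than 50% of them agreed. I just want to let you know, in case you are one of the people who believe that: it's a bit like believing that the snow on your TV is there because it's snowing outside.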
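The cloud, you know, one simple but important detail about the cloud, if you like, is that the cloud is really a data center of one kind or another, or rather, any particular cloud service is a data center. The only thing is that it is a different kind of data center from the traditional one. So, I'm going to give an overview of the cloud so that the speakers who follow me can go into more detail about how the cloud is used, because there's no point in covering the same ground.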
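So, the first point I'd like to make is about cloud services, you know? One of the things actually happening because of cloud computing is what I call the death of the brand. A whole series of software brands have had tremendous power, and continue to have power, in enterprise computing. Once you go into the cloud, they are no longer powerful, you know? When you buy a cloud service, you care about the application; of course, you care about the service level the cloud will give you, because you don't want the cloud service to fail frequently; and you care about the cost of use. You care about those things because it is a service. But what you no longer care about is the hardware it runs on. You don't care what the networking technology is, you don't care what operating system it's running on, you don't care what the file system is, and you don't even care what the database is, which is really specific to any given database service in the cloud, you know? The effect of this is that a great many software brands have no real value in the cloud because, you know, you go to the cloud for a service of some kind, not for a product any more. So, I thought I would do a couple of slides on the reasons not to use the cloud. You know, these are, if you like, simple and obvious, but somebody has to say them, so I thought I would.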
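So, reasons not to use the cloud: if they cannot provide the kind of data and process management that you want, then, you know, it simply doesn't meet your criteria. If they can't give you the performance you want, it doesn't meet your criteria. If the cloud doesn't give you the flexibility to move things around the way you want, it doesn't meet your criteria. Those are the obvious reasons why a particular cloud service won't suit a lot of people doing enterprise computing.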
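You might not do it because you can do it more cheaply yourself. The cloud is not always the cheapest option. Some people seem to think that because it is usually a cheap option it will always be cheaper, but that isn't always so. The other thing is that if you are taking an application from the cloud and it doesn't integrate well with what you are doing, then you probably won't go ahead with it. So, you know, those are the reasons for rejecting it.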
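Here are the reasons for adopting it. You know, one thing you can do in the cloud is prototyping, and that's almost bulletproof. Prototyping in the cloud and then implementing in the data center is perfectly feasible, and a lot of people are doing it. You can offload work from the data center with non-critical applications, because you can probably find some type of cloud service that meets the service level you need for a non-critical application. And you can offload particular standard applications to products such as salesforce.com and the like. Where everybody has the capability in an area and the area is not specialized, then, you know, what's available in the cloud is probably everything you need.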
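So, the last thing I want to say, and it's really an interesting thing, is that when you actually look at the cloud, one way to understand it is as a series of economies of scale. The whole point is that somebody is running a data center out there and you are going to dial into that data center from somewhere and use it, so they had better be able to do it better than you can do it yourself. So, you know, it really is about economies of scale.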
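The cloud provider chooses the location of the data center, and the best place to put a data center is right next to a power station, especially a cheap one. So, a power station up north that happens to be hydroelectric or something like that. That is usually the cheapest, you know? You can locate the data center there, and you will find it is cheaper to hire people in a place like that than in the center of New York or San Francisco. You can standardize the whole facility in terms of air conditioning and power. That saves you a lot, because it means you can dedicate a whole building to it, and that is what all the cloud operators do. They standardize the network hardware and they standardize the computer hardware they use, typically commodity x86 boards, and often they assemble them themselves. Some of them are actually building the whole thing. They will use whatever software they can use without paying for it, because that means there is no cost to using it. They standardize across all their software, so they never upgrade anything except by upgrading everything at once. They organize support, so they pay for support to a whole number of different providers who have their own support facilities. In a sense, they will have more operational capability than you would have for the service, they have the ability to scale up and scale out, and they will monitor their usage in a way that most data centers cannot, because they are running just one standardized service whereas most data centers run a whole range of things. That is really what the cloud is all about, and in a way it defines whether the cloud is of interest to you, or of interest for a particular application. So, you know, my rough rule of thumb is that wherever economies of scale are possible, the cloud will take over sooner or later. But where innovation and flexibility count, where you are doing something very specific your own way, that really can't be delivered. There, the cloud will always be second best.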
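Okay. Let me pass it back to Eric, and on to Gilbert.
Eric Kavanagh: All right, Gilbert, I'm giving you the keys to the WebEx here. Stand by. Just click anywhere on that slide and use the down arrow on your keyboard.
Gilbert Van Cutsem: I think I'm in control.
Eric Kavanagh: You're in control.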
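Gilbert Van Cutsem: All right. Here we go. The cloud imperative: is the sky the limit, is it an urban legend, or what do you make of it? These are just a few points for discussion and things to think about.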
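First, on the "what" side, I don't think anybody doubts this, as we all know. The move to SaaS will continue because software never really dies; it just migrates to the cloud, right? I think I said that in the previous edition. Oh no, or was it Eric who said it to me in the previous edition? I think the obvious reason, and this goes back to Robin to some extent, is that on the corporate side the lineup is easy to see. The chief marketing officer always wants it all, and he wants it now. So he is in a rush to get products to market, and this is a great excuse for him. The chief information officer, however, is a bit nervous about SaaS and the cloud, because the whole elasticity issue means that what goes up must also come down. You have to be ready to scale out, but also to scale back. So he is a bit nervous about that. The chief financial officer is not nervous, no more than usual, but he says, "Hey, how much is this going to set us back?" You know, it's the notorious CAPEX versus OPEX discussion. It's very old, but in this world it's very important. And then, of course, last but not least, there is the chief executive officer. He says, "Whoa! Risk mitigation! Guys, you are all very excited, but are we ready for this?" Because risk is what's on his mind.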
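So, what are the risks? Just a few thoughts, right? We are dealing in thought leadership here, but it's not finished, because this is all very new stuff. If you think about it, we really don't have many data points. So, on the risk side, we also have to deal with on-boarding. You know, having the people who sign the agreement say, "Yes, this is what we want, this is the way to go," is not enough. You have to get the team on board. Remember the movie? Lost in Translation; that's what on-boarding is all about. And, as Robin said, you know, on-premise isn't necessarily going away right away. So you have to integrate the two worlds. It's a hybrid world. So, how are you going to do that? Is it 80-20, the Pareto rule; is that okay? Is that good enough? And then, when you connect systems, it's garbage in, garbage out. Is that okay? Is that sustainable? Because, you know, when you migrate, you have to map your enterprise onto the new system, so how do you do that? The last one, which I think is very important, is multi-tenant architecture, which means that the privacy of your own data (sometimes called "owning your own data") becomes very important, you know? A hundred people use the same system, with one database sitting underneath it, so who gets to see my data? Just me, right? Are you absolutely sure? Data privacy, data security: bring in the experts. If you are the CIO, this puts the "I" back in CIO, because now you are in charge of information management. If you are a CIO, this is going to be very interesting.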
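So, let's talk about the "why." I think the strategic intent behind all this is very, very simple. If you are a subscriber, there is market pressure. If you are a provider, there is competitive pressure. If you have peers, there is peer pressure. If you are a subscriber, it's just market psychology. Everybody wants to go to the cloud, to SaaS, or whatever you call it, cloud SaaS; we all need and want to go there. The reasons are usually financial. That's the obvious reason, but if you think about the financial side, you run into what I call the "bill versus budget" paradox. Do you want a subscription, an all-you-can-eat system at $50 or $500 a month or something like that, or do you dream of usage-based pricing where you pay only for what you actually use? So, what about usage-based and consumption-based? Do you want to meter all of that? It probably won't happen right away. So you end up with a hybrid mechanism: I pay 200 euros a month and occasionally 500 euros because I have to pay for extra consumption. In my view, "retainer plus" is probably the way to go.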
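The "retainer plus" idea can be made concrete with a little arithmetic. The Python sketch below is only an illustration: the function name, the number of included units and the overage rate are assumptions chosen so that the figures match the 200 and 500 euro amounts mentioned above; no vendor's actual pricing is implied.

```python
def monthly_bill(units_used, retainer=200.0, included_units=100, overage_rate=1.5):
    """Hypothetical 'retainer plus' bill: a flat fee plus metered overage.

    The retainer covers `included_units`; only usage above that is metered.
    All default numbers are illustrative assumptions.
    """
    extra_units = max(0, units_used - included_units)
    return retainer + extra_units * overage_rate


# A quiet month stays on budget; a busy month adds a metered surcharge.
quiet_month = monthly_bill(80)    # within the included allowance
busy_month = monthly_bill(300)    # 200 extra units are metered
print(quiet_month, busy_month)
```

With these assumed numbers, the quiet month costs the flat 200 and the busy month costs 500, which is the "bill versus budget" trade-off in miniature: the budget is predictable up to the allowance and the bill follows consumption beyond it.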
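But there is also some hidden intent, on what I call a broad front, and I believe this is absolutely true. It is a change of control: the relationship between the CIO and the CMO, a power shift or power struggle, with the CMO saying, "I want it all and I want it now," and the CIO saying, "Hey, this is all about data, you know? I was running things twenty years ago, when it was all about hardware and systems; ten years ago it was all about applications; and today it's all about data, and since I am the CIO, with the I for information, it's all about me. I'm in control." So, I think that is the power shift or power struggle going on between the CMO and the CIO.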
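So, finally, this is still so young that nobody really knows whether we are in an innovator environment or an early-adopter environment. I believe we are in an early-adopter environment, not early majority, just early adopters, and maybe about halfway through. So, for customers, end users, subscribers, it's about getting a head start, because the CMO wants a head start, right? So, it's important not to end up with what they call diminishing returns. A limited head start can lead to diminishing returns. That's why it is so important for everyone to make sure that single points of failure are not an issue and that data security is respected. So, this is going to take a lot of change management. So, finally, almost done, this is the last slide: how are we going to do this? You know, how do you make the move to the cloud and to SaaS seamless and painless? Well, by focusing on two things: provisioning, which is very important, and on-boarding, which is even more important.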
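Eric Kavanagh: All right…
Gilbert Van Cutsem: In which case, the sky is the limit. Thank you.
Eric Kavanagh: Yes. That's great. I love the provocative ideas, and I like the way you broke everything down. I think it makes a lot of sense. Let's go ahead and push Ashish's first slide, and I'll hand the keys to the WebEx to Ashish. Okay, go ahead. Just click anywhere on that slide and use the down arrow on your keyboard. There you go.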
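Ashish Thusoo: Okay. Thanks, Eric. Hi everyone, this is Ashish, and I'm going to introduce you to Qubole. So, to start with, Qubole essentially provides a big-data-as-a-service platform. It is a cloud-based platform hosted on the Amazon cloud and the Google cloud, and we offer, in a turnkey fashion, technologies such as Hadoop, Hive, Presto and a bunch of others that I'll talk about, so that our customers can essentially get out of all the mess of the big data infrastructure world, or out of the operations of actually running this infrastructure, and focus more on their data and the transformations they want to do on it. So, that's what Qubole is all about.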
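In terms of tangible benefits, one way to think about Qubole is, of course, as a turnkey, self-service platform for Hadoop-based big data analytics and big data integration. But fundamentally what it does is, you know, bring all the benefits of the cloud to big data engines such as Hadoop, Hive, Presto, Spark, Chartly and so on. The key benefits of the cloud, as we all know, are making the infrastructure adaptive, by which I mean agile and flexible enough to adapt to the workloads running on any of these engines, and making these engines more self-service and collaborative, in the sense that, you know, Qubole provides interfaces through which you can use these technologies not only for your development or developer-facing tasks; even your other data analysts can start reaping the benefits of these technologies through a self-service interface.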
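On that note, you know, this is one of our views of the benefits of the cloud that Qubole brings to big data. If you just compare it with how Hadoop and its workloads run in an on-premise setting, there you always think in terms of static clusters. You know, you provision your clusters, you size them for peak usage, and they stay that way, and if you have to change them you have to go through the whole process of procurement, deployment, testing and so on. Qubole changes that by creating clusters entirely on demand. Our clusters are completely elastic: we use the cloud's object store to actually store the data, and then the clusters come up, you know, based on demand from users, and they go away when there is no demand. So, that makes this infrastructure much more agile, flexible and adaptive to your workloads.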
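The contrast between a static cluster sized for peak usage and a cluster that follows demand can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Qubole's auto-scaling logic: the function name, the tasks-per-node figure and the node limits are all hypothetical.

```python
import math


def target_nodes(pending_tasks, tasks_per_node=8, min_nodes=0, max_nodes=1500):
    """Hypothetical sizing rule for a demand-driven cluster.

    With no pending work the cluster shrinks to `min_nodes` (it "goes away");
    otherwise it grows to cover the work, capped at `max_nodes`.
    """
    if pending_tasks <= 0:
        return min_nodes
    wanted = math.ceil(pending_tasks / tasks_per_node)
    return max(min_nodes, min(max_nodes, wanted))


# Demand over a day, in pending tasks (made-up numbers).
hourly_demand = [0, 0, 40, 400, 100, 0]
elastic_sizes = [target_nodes(d) for d in hourly_demand]
static_size = max(elastic_sizes)  # a static cluster must be sized for the peak
print(elastic_sizes, static_size)
```

In this toy trace the static cluster runs at its peak size for every hour, while the elastic one sits at zero nodes whenever nothing is queued, which is the operational difference described above.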
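Another example of flexibility: today you may have created a static cluster on-premise with a certain workload in mind, and if your workload changes and your infrastructure now needs to be upgraded, you may need more memory on the machines, or something like that. Again, you know, doing that in the cloud, through Qubole for example, is very simple. You can always rent new, different types of machines and, you know, have a 100-node cluster up and running within minutes, as opposed to waiting for weeks as you would with on-premise Hadoop.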
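The other key thing that differentiates Qubole from on-premise is that Qubole is essentially a service offering, so all the tools and infrastructure you need are integrated into the service. With an on-premise deployment, you know, you essentially have the software, you have to run it yourself, and you have to integrate things yourself to realize those benefits. All the benefits of the SaaS model, you know, apply to the way Qubole delivers big data, as opposed to running Hadoop yourself.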
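This slide broadly covers our architecture. Of course, we are cloud-based; we store the data in the cloud's object stores, whether that is Google's cloud with Google Compute Engine or the Amazon Web Services cloud. We take all the Hadoop ecosystem projects, and we have developed key IP around auto-scaling and self-management. We have done a lot of cloud optimizations to make these component technologies work really well in the cloud, because, you know, cloud infrastructure is very different from just running on bare metal, and we have a bunch of data connectors to move data in and out of the platform. So, that's the cloud platform, and a key characteristic is how it is all self-service, so that you don't need a big operational footprint to run it. We combine that with a data workbench, whether those are tools for analysts, data governance tools, templating tools and so on, so that you can bring the benefits of this technology not just to developers but to other business users in the enterprise as well. And, of course, we also connect this cloud platform with tools you may already be using, whether that is something like Tableau or more data-warehouse-type products such as Redshift, and so on and so forth.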
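Today the service is running at large scale. Across our customer base, we now process close to 40 PB of data every month. Our cluster sizes range from 10-node clusters to 1,500-node clusters, and in terms of the range of scale we can handle, as far as I know we probably run some of the largest Hadoop clusters in the cloud. Across all our clusters, we go through about 250,000 virtual machines every month. Keep in mind that our model is on-demand clusters, which has huge advantages in reducing your operational workload, improving your productivity, and so on.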
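Finally, you know, here is an example of how Qubole has been transformative for a company. This is one of our customers. They were already in the cloud; they were running Elastic MapReduce in the cloud, for example, and data usage there was fairly limited. They had around 30-odd users who could use the technology. With Qubole, they have been able to expand that to more than 200 users across the company, they have seen an expansion in big data use cases, and it really delivered what we call the definition of an agile big data platform, which has become essential for many of their analytics workloads.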
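So, that's it; that was a brief primer on Qubole. Essentially, our vision is about making enterprises much more agile with big data, and essentially we take the benefits of the cloud and apply them to the big data technologies around Hadoop, so that our customers can reap the benefits of agility and flexibility and of the self-service nature of the cloud, and become much more effective in meeting their data needs. So, I'll stop there and hand it back to Eric.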
Eric Kavanagh: Okay. Sounds good. Now I'll hand it over to Mike Miller of Cloudant. Mike, I'm passing you the keys. Just click on the slide and you're off. Take it away.
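Mike Miller: It looks like I have the keys. So, I'll apologize. I lost… I think I forgot to include some fonts with the presentation. So, hopefully you can look past that and imagine that it's beautiful. But, yeah, this is fun. I have a long list here of provocative things I heard and wrote down, and I would love to come back to them in the panel discussion. So, I'll try to get through this quickly.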
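So, I'll start with Cloudant. Cloudant is a database-as-a-service, we are a cloud provider, and in fact I don't even have the new logo. We were acquired by IBM not long ago. So, we are… I'm going to talk about our service, and in particular focus on trying to make our users and customers agile in a quite different way from the previous speaker.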
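Cloudant provides database-as-a-service and other data-related services to people building applications. So, we engage directly with developers, and we focus on operational or OLTP data, as opposed to the analytics we heard about earlier from Ashish. The key point is that Cloudant's whole value can be boiled down to helping our users do more: build more applications, grow more and sleep more. I'll talk about each of those in detail, but the general idea is that if you are a user, you know, whether you are in an enterprise building new applications and adding features to existing ones, or in a web or mobile startup, you should focus on your own core competency. Until maybe ten years ago, IT was a competitive differentiator, sadly, and even just running a database well could give you a competitive advantage. Thankfully, those days are gone! So, the way we really try to work with our users is to encourage them to use modular, reusable, composable services, the idea being to cut time to market and improve scalability. And the general idea here is that the cloud is not just something new being pushed onto users; it really is an evolution of the market, because the way people build applications, the way they use applications, the devices they run on and the scale of the data have changed fundamentally in the last 5 to 10 years. That really stresses the existing application architectures, both for building applications and just for processing that data and analytics workloads offline. So, it opens up a whole world of opportunity.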
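So, Cloudant is a distributed database-as-a-service, and I believe it is unique in that it has really embraced a mobile strategy from the very beginning. I'll talk about that in detail, but the idea is that when you write an application now, you are not writing for just one platform, right? You are writing something that can run at petabyte scale in the cloud, that also has to run smoothly on the desktop or in the browser, and more and more we see that it has to run on mobile devices, or on semi-connected devices or wearables, or what we call the IoT. So, I think, you know, applications that handle and take advantage of these different clients well are incredibly competitive in the marketplace, and what we try to do is make it very easy for people to write to a single API in a single programming model and handle data across all these different devices at very different scales. The interesting thing is, you know, initial uptake was in web and mobile; this is where we saw our big traction, but even before the acquisition we were seeing a larger and larger number of enterprise users, even in places I would call as conservative as Fidelity Investments, right, working on a virtual safe deposit box. So, I think that this market has actually taken off much faster than even we had expected.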
Let's talk about the cloud a little bit more and then turn it over. The idea here is that we really make it easier for you to build more: you use a service like Cloudant to store the database state of your application, and then move that to your different devices and keep things in sync. Contrast that with how you build an application on a traditional stack, where you have to buy servers, like we heard about before, provision them, and install and license things. With Cloudant, we try to make it easy. All the data services that you will need, all the search services, database, etc., for your application can be acquired by signing up, getting a single endpoint URL and then starting to use that URL. The idea being that this is a service that uses multiple indexes and multiple technologies underneath, some proprietary and many open source, but we use them together in the way that the end developer or product team needs in order to build something. And so, databases look very different than they did at inception, where you would have, you know, rows and columns to store business ledgers. Now we need to store JSON documents, and that generally happens over HTTP or using existing open-source APIs. And then, finally, we give you the things that a database should do, like a primary index and secondary indexes for, you know, retrieval and OLTP and then driving application logic. But in addition, there is a wide range of things like search, geospatial and replication between devices that are very important. So, that's all provided underneath our API.
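To make the difference between a primary index and a secondary index over JSON documents concrete, here is a small in-memory Python sketch. It is not Cloudant's API and makes no network calls; the `TinyDocStore` class, its method names and the sample documents are all hypothetical and exist only to illustrate the two kinds of lookup.

```python
import json
from collections import defaultdict


class TinyDocStore:
    """Hypothetical in-memory document store with two kinds of index."""

    def __init__(self, indexed_field):
        self.indexed_field = indexed_field
        self.primary = {}                   # primary index: _id -> JSON text
        self.secondary = defaultdict(set)   # secondary index: field value -> _ids

    def put(self, doc):
        doc_id = doc["_id"]
        if doc_id in self.primary:
            # On update, drop the stale secondary entry for the old value.
            old = json.loads(self.primary[doc_id])
            if self.indexed_field in old:
                self.secondary[old[self.indexed_field]].discard(doc_id)
        self.primary[doc_id] = json.dumps(doc)
        if self.indexed_field in doc:
            self.secondary[doc[self.indexed_field]].add(doc_id)

    def get(self, doc_id):
        """Primary-index lookup: fetch one document by its _id."""
        return json.loads(self.primary[doc_id])

    def find(self, value):
        """Secondary-index lookup: all documents whose indexed field equals value."""
        return [self.get(i) for i in sorted(self.secondary.get(value, ()))]


store = TinyDocStore(indexed_field="city")
store.put({"_id": "u1", "name": "Ada", "city": "Boston"})
store.put({"_id": "u2", "name": "Lin", "city": "Boston"})
store.put({"_id": "u3", "name": "Sam", "city": "Austin"})
print(store.get("u3")["name"], [d["_id"] for d in store.find("Boston")])
```

The primary index answers "give me document u3", while the secondary index answers "give me every document in Boston" without scanning all documents. A hosted service would expose both over HTTP; that part is deliberately left out here.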
But the really distinguishing thing that allows our users to grow, and, for instance, why Samsung was one of our earliest and biggest customers, is that, you know, Cloudant is clustered underneath. Each cluster is a shared-nothing architecture of three to hundreds of nodes, and we run those in over 35 data centers globally now, so that there is always a place for you to store your data within a millisecond of any other cloud provider or most existing data centers. So, one of the big early objections we were challenged with in the cloud was, "How do I split a hybrid architecture, with my application servers maybe here and my database servers maybe someplace else? That will never work. They have to be on the same machine or in the same place." Well, the reality now is that by cobbling together different cloud providers, and this is something that we still do as an IBM company, you can make sure that your database is always within a millisecond of any other place, and we take care of the peering agreements and just take the cost off the table; it is something that we worry about. So, Cloudant is really a database as a service, but you can think of it more like a CDN for your database, for data that changes, you know, on a millisecond time scale.
And really, finally, I think the major selling point is that if you build an application that's successful, you have to decide as an organization whether you want to then grow the 24x7, 365, globally distributed, you know, operations team that it takes to run that at large scale, or whether that's something that now is commoditized as well. And so we focus very heavily on helping on-board new users and new customers, helping them make the jump to the cloud and build architectures that use cloud analytics and work with everything in a very coherent and scalable way, so that in the end, you know, our users focus on building applications and not on surviving their own success.
And with that, I will just say thanks. I skipped over some slides, and I will turn it over to Lawrence.
Eric Kavanagh: That is fantastic. So, Lawrence, let me hand you the keys to the WebEx here. Just give me one second. There you are. Keys being transferred. Just click on that slide anywhere and use the down arrow.
Lawrence Schwartz: Great! Well, thank you for the handover and, you know, thanks to all the presenters today. That was a nice way to set everything up, and there will be a lot of things to talk about as I get through the presentation here. So, again, I am Lawrence Schwartz. I run marketing over at Attunity and, you know, I want to talk about some of the issues that we see and some of the challenges in the space that we are in.
So, a quick overview and introduction to Attunity as a company and who we are. We focus on moving data. So, we talk about moving any type of data anytime, anywhere and enabling that for users. We are a public company based in the Boston area, and when we talk about the cloud, we have some great relationships: we are part of the AWS network, a big data integration partner, and we have been close to them since the launch of Redshift, even working with them before that. We have gotten some nice recognition for the work that we have done, and as a company, Attunity is used in over 2,000 places, and we are in half of the Fortune 100 companies. So, we have some good experience.
As you can see at kind of the bottom of the slide here, a big issue is that you've got data generated from all different types of sources these days, from traditional, you know, CRM systems, all different places on the Internet, all the different places where data could start, and then it has to go to places to be analyzed, worked with and looked at, and we focus on, you know, getting the data where it needs to be. So, I am gonna talk about the solutions that we offer specifically for the cloud, and when you think about that, oftentimes the data starts somewhere on-premise. So, besides having relationships with places like Amazon, we have very close working relationships with places like Teradata, Oracle and Microsoft, all the places where data traditionally existed on-premise.
So, when you think about this, you know, and I think it was Eric who, you know, talked about on-boarding being the key to the whole process, right? I have been thinking about the issues in getting data onto a system. Now, here are just some of the bottlenecks that exist today. When you look at the people moving data into a data warehouse or a database and to the cloud, we can see a lot of time is spent on what's called the ETL process, the extraction, transformation and loading of the data from where it resides to where it needs to go. If you think about getting value from the data, that's not where you want to be spending your time and effort; that's not the most productive area for a data scientist. And the flip side to that is this: very few people are very satisfied with that process. It's less than 20 percent. We really find that to be a big problem. So, there is a real kind of pain point, a bottleneck, if you will, in getting to the cloud and doing the type of on-boarding that people need to do, and there are even, you know, real performance issues. You know, you could look at how you get stuff into the cloud, and if you want to get, you know, a couple of terabytes into the cloud, you could certainly ship it to the cloud, and there are still places that do that with larger data sets, but a lot of the traditional methods just don't have the performance to get there. So, it's a real, you know, pain point in the marketplace as people think about how they get to and how they move onto the cloud.
So, if we step back and look at what that means, why that's there and, you know, how this has come about, you know, both Eric and Gilbert talked about the fact that, you know, the data that exists today, you know, on-prem is here to stay, and, you know, cloud is here to stay. So, that integration becomes all the more important, and oftentimes people fall back on the tools that they have to move data over. Again, there are a lot of ETL or traditional tools out there to kind of move data over in batches, but there are a lot of issues with that. People find that traditional ways of moving data are very time and resource intensive to set up. They often require a lot of scripting, even if they are autonomous in some way, and a lot of people, a lot of manpower. There are so many sources and targets, particularly on-premise today, to move into the cloud, you know, all the systems I mentioned earlier, Oracle, Microsoft, Teradata, so managing that whole part of it is hard. And then, you know, looking at the performance as it moves over, being able to have the tools to make sure everything is moving quickly: a lot of the systems that exist today aren't well built for that.
And then lastly, a lot of the way people think about moving data is kind of as a batch process, and if you are thinking about trying to do more in real time, that's not the most effective way; you end up using stale data that's not interesting to the organization. So, when you look at what Attunity does in this space and how we think about it, it's a different architecture that we are focused on. We really built this from the ground up and thought about how, when you have to go from an on-premise source database out to the cloud, you make sure that it's very easy and straightforward to do. So, that requires rethinking how you do the monitoring and the setup. It's making the whole thing just kind of a couple of clicks to get started. It's really thinking about the movement and optimizing the performance over the channel, and working with a wide variety of platforms, because a lot of big organizations kind of have a best-of-breed approach, and a lot of different types of databases or data warehouses are already in their environment. So, you have to think about it differently. You can't just do an extract, you know, dump the data out in some form and load it somewhere. You have to kind of think about changing the architecture and how you do the processing, do more of it in memory and focus on a more performant version.
So, what does that mean and what does that look like? One key tenet for solving the problem with the cloud is that things have to be easier to set up. You know, that screen there is just some screenshots of how we do it, but it's, you know, 1, 2, 3: kind of pick your source and target, pick what you want to do, whether you want to do a one-time load or CDC, and then just go. It needs to be no harder than that, you know? I know we just, you know, saw the presentation from Mike and he talked about how easy it was for people to get started with Cloudant. It's the same type of thing: you have to be able to get going in a few steps, otherwise you start losing the value of it. When you think about the monitoring and control of it, there are some great companies out there that I know you're familiar with, like Tableau and others, who have done a great job of visualizing the end product of data. But, you know, you also need to be able to visualize the movement process and the management: where is the data set, on-premise, in the cloud or moving over? Is there a lag? Is there latency? Having that viewpoint is critical, and that's an important part of moving forward.
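The choice between a one-time load and CDC (change data capture) can be illustrated with a short Python sketch. It is a deliberate simplification and not how any particular product works: production CDC tools typically read the database's transaction log, whereas this hypothetical `capture_changes` function just compares two snapshots keyed by primary key.

```python
def capture_changes(old_rows, new_rows):
    """Hypothetical snapshot-diff stand-in for change data capture.

    Both arguments map primary key -> row (a dict). The result lists only
    what changed, which is all that has to be shipped to the target.
    """
    changes = []
    for key, row in new_rows.items():
        if key not in old_rows:
            changes.append(("insert", key, row))
        elif old_rows[key] != row:
            changes.append(("update", key, row))
    for key in old_rows:
        if key not in new_rows:
            changes.append(("delete", key, None))
    return changes


yesterday = {1: {"seat": "A1", "sold": False}, 2: {"seat": "A2", "sold": False}}
today = {1: {"seat": "A1", "sold": True}, 3: {"seat": "A3", "sold": False}}

full_load = list(today.items())               # one-time load: ship every row
delta = capture_changes(yesterday, today)     # CDC: ship only the differences
print(delta)
```

A one-time load moves the whole table once; after that, applying only the captured inserts, updates and deletes keeps the target current without re-sending unchanged rows, which is why the batch-versus-continuous distinction matters for stale data.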
Another aspect that becomes important is the performance. You can't just rely on the standard FTP kind of two-way protocol that people have been using for years. As you move more and more data over, you have to have an optimized file-channel protocol that is geared more towards, you know, one-directional movement most of the time, and you have to think about how you break up tables, ship them out and move them over, and you have to give people the flexibility to do that; otherwise you can't get the data there in time. If you do that differently and think about it differently, you can get a 10x performance improvement, but you have to rethink the technology.
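As a rough illustration of "breaking up tables and shipping them out", the Python sketch below splits a table's rows into fixed-size chunks that could then be compressed and sent independently. The function name and the chunk size are assumptions for illustration; the optimized transfer protocol itself is not modeled.

```python
def split_into_chunks(rows, chunk_size):
    """Yield consecutive slices of `rows`, each at most `chunk_size` long."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


# A made-up table of 10 rows, shipped in chunks of 4.
table = [{"id": i} for i in range(10)]
chunks = list(split_into_chunks(table, 4))
print([len(c) for c in chunks])
```

Because each chunk is independent, chunks can be sent in parallel or retried individually if a transfer fails, which is one reason chunking tends to help when moving large tables over a wide-area link.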
And then lastly, as I mentioned earlier, you know, you have got a lot of different places where databases exist today. So, you have to be able to work with all of those and offer the widest range of support so that people can get onto the cloud. So, what does that mean for users? Here are two quick cases of how people had challenges getting to the cloud, could see the value, and were then able to get there once they had the right toolset.
So, one company that we work with, Etix, does online ticketing; they are a major provider in this space, and I know Robin talked about data center offload being kind of a key case for the cloud. This is exactly what they were trying to do. They were trying to load and sync their data from Oracle on-premise to Redshift and do that in a timely fashion. And the interesting thing is, you know, going back to what Gilbert said, you know, he talked about on-boarding being an issue. They could see the intrinsic value of Redshift, they could see the cost savings, they could see all the advanced analytics that they could quickly start doing; they knew that value, but there was a roadblock to getting there. In this case, they looked at it and said, "Well, I see the value of Redshift, but it's gonna take us, you know, three months of development effort and time and, you know, maybe hiring a DBA and doing all this extra work to get there." So, there was a real block in the path. Once they had the right toolset, the right data integration capability, they were able to go from, you know, months of planning to literally getting going in minutes, and that's again lowering the barrier to getting people onto the cloud. You need to have the right capabilities to deliver on the promise.
The last, you know, slide I have here, and kind of another use case: you know, we've worked with other companies such as Philips, you know, well known in many spaces. We work with their health-care division and, again, they were trying to go from an on-premise source, in this case SQL Server, over to Redshift. They knew the value, they knew all the analytics they could do on it, and they had done some testing on it, but they saw that without the right tools this was something that was gonna take them, you know, weeks, and they had actually been spending weeks spinning their wheels trying to get things moved over. Once they had the right tools to simplify it and get it moved over quickly, they were able to start loading in less than an hour, you know, over 30 million records. So, the real time went from a couple of months to about two hours for them. And then they were able to do the things that they wanted to do. They didn't have to focus on the data loading; they could focus on the operational support. They got much better metrics for care, cost and operations. So, when you think about the whole challenge, you know, that's the space we are in, enabling the data movement, and now more than ever with the cloud, when you think of it as kind of a remote place to put your data, you know, this becomes an area that more and more people need to solve to take advantage of what's out there. So, that's an overview of what we do, and with that I will pass it back to you, Eric.
Eric Kavanagh: Okay. That sounds great. We've got a good amount of time here. We'll go a bit long to get to some of your good questions, folks. So, feel free to send your questions and I've got a few questions myself.
Lawrence, I guess I will start off with you. You guys have been in this space of kinda supercharging the movement of data for a while and you have been watching the cloud very carefully and I've really been kinda surprised at how long it's taken major enterprises, Fortune 1000 companies to fully embrace cloud. I mean, there are, of course, pockets of severe interests, let's call it, in large organizations, but as a general rule, there's been a bit of a reluctance that is only starting to wane in the last year or so, at least from my perspective, but what do you see out there in terms of cloud adoption and readiness of the enterprise to use cloud computing?
Lawrence Schwartz: Sure, I think you are right. It has been a significant change and it's certainly taken time. You know, they have that joke about the overnight success that really takes years in the making, and that's been true for the cloud, right? You have seen that kick in over the last year, but it's due to all the hard work of a lot of players like Amazon who have been doing this for years, you know, to get the service adopted, to prove their mettle through, you know, failures and problems, and to give the diversity and flexibility that they have; that's something that Redshift offers. So, I think the maturity has gotten there, the confidence has gotten there. I think it's infiltrated into a lot of companies through small areas, you know, small use cases, small trials, kind of outside IT control, and now that those successful periphery projects have proven themselves, there's more of a willingness to have the conversations about how that spreads. And frankly, you know, additional tools have also come out to make this easier, like what we do, and, you know, they not only move the data but show the value of BI in the cloud.
So, in one way, it's an overnight success, or a big uptick in the last year, but a big part of that has been all the hard work of building up to it. So, now we as a company see a lot more adoption. As a business, what we do has grown quite a bit, and, you know, we do a lot of on-premise to on-premise movement. Now, cloud shows up in a lot of the conversations as, you know, real business cases, real offloading cases, where a year ago it was certainly, you know, just more exploratory. Now, they have got real projects to move. So, it's been nice to see that movement.
Eric Kavanagh: Okay. Great. And Mike Miller, you had mentioned that you heard a couple of provocative statements that you wanted to comment on, so, by all means, what do you find interesting or what do you wanna talk about?
Mike Miller: Oh, I think Robin made a point on his second-to-last slide, contrasting where innovation counts: "The cloud will always be second best." I'd love to hear a little bit more about that, because in my mind, if I was thinking about building, you know, an application or some new service, it's hard for me to think that my organization, no matter what it is, really wants to go engineer-to-engineer with Google, Amazon, IBM, Microsoft. So, I think maybe I misunderstood his point with that.
Eric Kavanagh: Interesting. Robin, Mike has thrown down the gauntlet. What do you think?
Dr. Robin Bloor: Well, I mean the point here is that there are a number of situations that I've come across where people have gone into the cloud and walked back out, and the reason they walked back out was, you know, when it came down to it, performance driven. Performance was actually the crux of the application being built, they couldn't get the low latency they wanted, and so the cloud was of no use to them. And, you know, the situation was that, even if they were given the ability to measure the behavior of the networks in the cloud, the workloads in the cloud were something they had absolutely no control over, and because of that, they couldn't create the tailor-made services that they were looking for, and that's a performance edge. I don't think there's anything in terms of, you know, coding that's going to constrict what you can do in the cloud. It's service level; that's the constriction. If that's part of where your critical capability is going to be, then the cloud is not going to be able to deliver it.
Mike Miller: Right. So, I appreciate that clarification. I do agree, actually, that transparency is one of the big things that I hear as a desire right now from users across many different providers. So, I think you raise a very fair point. When it comes to performance, I think that traditionally it has been very hard to, you know, go to any given cloud provider and find exactly the hardware you are looking for, but it is worth noting the upping of the ante in the race to basically free storage between Google and Amazon and other competitors, and I think you see the pressure that puts on driving down the cost of SSD, flash, etc. So, I think that's a fun one to watch going forward.
Dr. Robin Bloor: Oh, absolutely correct, you know? I mean, I think one of the things that is actually happening is that the second wave is coming on. The first wave was, you know, these wonderfully tailored services, although, you know, it's a little bit Henry Ford: you can have it in any color as long as it is black. But, you know, even so, there was an extreme reduction in certain kinds of data center costs. The second thing that happens is that, having actually built these huge data centers out, these cloud operators suddenly start discovering things that they can actually do that you couldn't do before because you didn't have the scale. So, there is, I think, a second wave which, to a certain extent, is going to make the cloud even more appealing.
Eric Kavanagh: Okay. Good. Let me go ahead and bring in Ashish, and I am gonna go ahead and throw up your architecture slide here. We always love these kinds of architecture slides that help people wrap their heads around what's going on. I guess one thing that just jumps out at me is, of course, YARN. We talked about that on yesterday's briefing. YARN is not a small deal. For those of you who aren't familiar with this concept, it is "yet another resource negotiator." It's really a very interesting development, because what has happened in the Hadoop movement is that YARN is kind of replacing the engine, if you will. Our speaker from yesterday referred to it as the operating system. It's like the new operating system of Hadoop, which of course has the Hadoop Distributed File System underneath, which is basically storage when you get right down to it, and MapReduce is what you used to have to use to work with HDFS. MapReduce is an absurdly constraining environment in terms of how you get things done. So, the purpose of YARN was to make HDFS much more accessible and make the entire Hadoop ecosystem much more flexible and agile. So, Ashish, I am just gonna ask you in general, since you are mentioning YARN here, and I am guessing that you guys are YARN compliant or certified: can you kind of talk about how you see that changing the game for Hadoop and big data?
Ashish Thusoo: Yeah, sure. Absolutely. So, I think, you know, there are two parts to this. Let me first talk about why YARN was done, and then talk about how that potentially changes the game and what fundamentally still is the same, you know, where it doesn't change the game. I think that's an important thing to realize also, because many times you get caught up in this hype of saying, this is the new, shiny thing and all the problems are going to go away, and so on and so forth. But the primary thing is that the strength and the weakness of the MapReduce API was that it was a very simple API and, essentially, any problem that you could structure around being a sorting problem could be represented in that API. Some problems can naturally be transformed into that, and some problems, you know, once you have just MapReduce at your disposal, you try to force into a sorting problem.
So, I think the latter is where YARN plays a role, by expanding out those APIs, by being able to compose maps and reductions and a whole bunch of different types of APIs in terms of how the data can be distributed between these two stages, and so on and so forth. You just made that API that much richer. So, now you have at your disposal different ways of solving that same problem, right? You don't have to be constrained by the API, and the problem gets solved one way or the other. If you are trying to do an analytics workload, you can express that in MapReduce, you can express that in YARN. The big difference that starts to happen is in terms of the performance metrics that you start seeing once you start programming to YARN, and in some cases a newer set of things, for example, streaming analysis and so on and so forth, starts becoming a reality when you start doing those things in YARN.
So, those are the differences that, you know, that thing has brought into the ecosystem. I think the richness there is much more on the API side as opposed to it being another resource manager, especially in the cloud context. If you think about it in the cloud context, the resource manager is actually the VMs that you bring up. Again, this is a big difference between, say, how you are running Hadoop clusters on-prem and how you are running in the cloud. On-prem, you have a constrained, static set of machines, you want to distribute those machines amongst different workloads, and YARN is used for that there. But, in the cloud, you know, you can bring up machines left and right. And so, just from the perspective of being a resource manager, it probably doesn't fill that big a need, specifically in the cloud. But from the perspective of providing this richness of APIs, which allows, for example, the Hive initiative to now program Hive not just to use MapReduce, but to have much richer plans for doing jobs and things like that, it brings those benefits to the ecosystem. I think that is where the true value of YARN belongs. In the cloud context, it's definitely not that interesting from the resource management point of view, but it's much more interesting in terms of what it enables other projects to do, in terms of new workloads that can now be programmed onto your data, or previous workloads that can be done in a much more efficient way.
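Editor's note: Ashish's remark that MapReduce can represent any problem you can structure as a sorting problem may be easier to see in code. The following is a minimal single-process sketch, not Hadoop's actual API; the function names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count` are hypothetical and chosen only for illustration. It shows the classic word count, where the sort in the middle is what brings equal keys together.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)


def shuffle_phase(pairs):
    """Sort pairs by key so equal keys become adjacent, then group them.

    This sort is the step that makes MapReduce a 'sorting problem'.
    """
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, [value for _, value in group]


def reduce_phase(grouped):
    """Collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped}


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["the cloud is a cloud", "the data is in the cloud"]
    print(word_count(sample))
```

A problem that does not reduce naturally to "group by key after a sort" has to be bent into this shape, which is the constraint the richer APIs Ashish describes are meant to relax.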
Eric Kavanagh: Right.
Ashish Thusoo: I had, you know, one more thing, just adding to what Mike said. There was another provocative thing which was said, which was around, hey, treating the cloud as yet another data center. I think that is one point of view which most companies look at, and that's the easiest point of view, actually: you have compute, you have storage and you have networking in your on-prem data center, and the cloud provides the same thing out there. So, I am just going to do in the cloud exactly the same thing that I am doing in my own on-prem data center and, voilà, that's how it should work. What we have found out, having been running in the cloud, in clouds where you have the ability to provision VMs within a minute, the ability to use a highly scalable object store to store data and things like that, is that the cloud architecture and these inherent abilities actually enable different ways of doing things. This is what I talked about in my slide as well. From the perspective of just Hadoop, there is the whole notion of running a static cluster versus on-demand dynamic clusters. That is something that you don't see happening in an on-prem data center, versus a true cloud where there's enough capacity to be able to support these types of workloads.
And so, I think there is definitely some shift needed. You know, the big fear for me is that if you just treat the cloud as yet another data center, then, while you still get some benefits, there are a lot of intrinsic benefits that you might ignore. Security is another one: there are a lot of differences between the way you deal with security in the cloud and how you would deal with it from an on-prem perspective, and so on and so forth. Just wanted to add that in, from my perspective.
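Editor's note: the contrast Ashish draws between a static cluster and on-demand dynamic clusters can be made concrete with a little arithmetic. This is a rough sketch under stated assumptions: the hourly demand figures are invented for illustration, the function names are hypothetical, and real pricing (startup time, minimum billing increments, reserved discounts) is ignored.

```python
def static_machine_hours(hourly_demand):
    """A fixed cluster is sized for the busiest hour and runs every hour."""
    if not hourly_demand:
        return 0
    return max(hourly_demand) * len(hourly_demand)


def on_demand_machine_hours(hourly_demand):
    """An on-demand cluster runs only the machines each hour actually needs."""
    return sum(hourly_demand)


if __name__ == "__main__":
    # Hypothetical workload: machines needed in each of eight hours,
    # with a two-hour burst in the middle.
    demand = [2, 2, 2, 2, 20, 20, 4, 4]
    print("static:", static_machine_hours(demand))
    print("on-demand:", on_demand_machine_hours(demand))
```

The spikier the workload, the wider the gap between the two numbers, which is the case where provisioning VMs within a minute matters most.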
Eric Kavanagh: Sure. Yeah. No problem. We have one attendee asking about various types of use cases like logistics and specifically HR, so I threw up this website of Workday. I wanted to make a couple of comments on that, and then, Gilbert, maybe I will bring you in to comment on the whole concept of architecture. So, in terms of HR, I actually heard a rather, well, I will call it, let's say, comment from an analyst a few months ago, I suppose, about going to the cloud for human resources. I have been doing some research on this, and a lot of HR-type functions are being outsourced to the cloud. Certainly stuff like payroll is fairly easy to outsource these days, as are benefits programs and insurance, that kind of thing. But there is a real serious caveat to keep in mind and, Gilbert, this is what I want you to comment on from an architectural perspective. You have to be very careful when you are moving to the cloud for some kind of critical business service. You want to be very strategic and very thoughtful, meaning you go through the process of making sure that you understand what's going to the cloud and what's staying on-premises. The folks from Attunity will tell you that one of the things they truly specialize in is making those connections, such that they provide the kind of connectivity you need. What's happening with some organizations is they go and use Workday, for example, to put some of their HR stuff in the cloud, but they don't do it all, or they don't do enough, or they don't think it through enough. And what happens then? Then they wind up having to manage the cloud environment and their original on-premises environment as well, which means, guess what? You just increased your cost, you doubled your workload and you created lots and lots of headaches for people. That's usually when someone gets fired, and then the guy who comes in has a real mess to clean up.
So, you really do have to think through the architecture of the data and the systems and the processes, and make sure you dot all your i's and cross all your t's. With that, I will throw it over to Gilbert for comments. I am guessing you will agree with that, but maybe not.
Gilbert Van Cutsem: Alright. Yeah. So, just another example of something similar that happened to me just yesterday. I lost one of my doctors because he went out of business. I don't know. It sounds amazing. He was a chiropractor and he went out of business. I don't know why, but the thing was this: I have no chiropractor and I like to go to a chiropractor, you know, occasionally. So, I find a new one and it's close by and all that. It's all good. And so, they go, as usual, you have to do all the paperwork and let us know if blah, blah, blah. But the good news is we have a new system because, you know, we're on the Web now, in the cloud. It's all cool. I go like, okay, you know, and they send me a link and I have to do all the paperwork online, which is fine, and I put all kinds of things in there that are kind of secret, like, you know, social security numbers and that type of stuff, and who I am, how old I am… all my details. I put it all there and I submit because, of course, I do believe in technology.
And then I walk up to the office the next day for my first appointment and they go like, "Did you do the form?" I go like, "Yes, Ma'am, I did." "Okay. Then we will go and find it." I go like, "Well, I did do it." And she goes, "Yes, we know, because you are the fifth person today to walk up to me and complain about us not finding the form." And I go like, "But you can't be serious about that. This is pretty confidential information. Where is it?" This happened to me yesterday, yeah, which brings back the whole issue and the whole idea of who really owns the data, right?
I know you move to the cloud and people get onboarded into a new system, like in this case, my chiropractor, and they subscribe to a new system. It's in the cloud, it's all safe, it's fully multi-tenant. They used to have an on-premises system, all the data was moved into the new system, but now, apparently, they can't get it out.
Eric Kavanagh: Yeah. That's not good.
Gilbert Van Cutsem: So, I don't know where my data is, and assume she gets really mad, right? She goes like, "Oh, this is impossible. I pay you money and my customers, my patients, sorry, are unhappy and the data is gone. I wanna get away from you. I wanna go to a different system, maybe also in the cloud, right?" How do you then move the data of your patients, in this case the data your business owns, to another system? How do I get it out, first of all, and then load it again? I am sure ETL in the cloud is an answer somehow, and we have experts on that, but it's not that easy.
Eric Kavanagh: Yeah, that's exactly right. And folks, I threw up this other slide here, another screen, to show you where you can find the archives. So, anytime you want to check them out - oh, there's the inside of our website, I don't want to show you that. So, here is the main website, and in the right column here you can see the different shows. TechWise is right here. You click on that and you get to the different pages where we actually post the archives. So, we do archive all these webcasts.
Actually, I wanna throw it back over to Mike, I suppose, and then also to Lawrence, to kinda comment on this story that Gilbert just told. So, Mike, now this is kind of a small-business concern. You guys are more focused on big business, but nonetheless, if a large company works with you and they want to go somewhere else, how do you manage that movement of the data and securing the data and so forth?
Mike Miller: Yeah. That's a great question. It's one that used to come up a lot more often in sales calls than it does now, which I find to be an interesting anecdotal piece of evidence. You know, I think that, first of all, we are talking about a lot of technologies, or at least deployment models, that are relatively new. This is very early in the cloud, right? We are talking about things like cloud or, in the case of data, analytics services like Hadoop and, for databases, NoSQL or NewSQL. These are fundamentally new technologies, and especially around things like Hadoop and NoSQL, there are all of the ancillary services, the connectors, right? You know, if I want to find somebody that consults on Oracle, that's something I can find, but that entire ecosystem is just kinda spinning up right now for these technologies.
So, it's getting easier day over day to say, okay, you know, give me a service that can read from 'x' traditional system, put it into Cloudant and do something with it, and then put it back into 'y' traditional system, right? So, now there are quite a few of those things, and it's actually more challenging, I think, for a typical user to understand what the best choice is, right, if I want to connect all the new technologies on-prem and then in the cloud.
So, I think as a cloud vendor, it's really on us to be very opinionated about that and to help walk users through the landscape of possibilities, because a lot of this is new, and I think that the average user, whether it's a CTO, a CIO or actually a developer, is coming up that learning curve fairly quickly. I think that a lot of the kind of baseline stuff is being worked out: cross-cloud connectors and, you know, taking away the really most basic worries about, say, bandwidth cost and whether or not you are going out on the wide area network versus staying on, you know, a VPN the entire time. A lot of those things have been kinda abstracted away, and that is the true promise of the cloud.
But, in general, I think that anecdote that we heard is something that is probably isomorphic to, you know, what would happen when you bought into a brand in a past lifetime: what happens if that brand doesn't deliver, how much can I really trust that brand? I think you are seeing exactly the same thing happen in the cloud and, you know, I think that companies like Microsoft, Amazon, IBM and Google are very much stepping up and saying that there will at least be multiple pillars of trust, and making sure that you are not going in with a company that's going to dry up and swallow your data, or worse, lose it or distribute it, right? And so, at least, they are dependable and they are anchoring the development of such an ecosystem. But, I'd say, to close, it's very early and a lot of that tooling is just getting started and, you know, I think you are going to see consulting services really putting a lot of focus on that in the very near term.
Eric Kavanagh: Yeah. That's a really, really good comment you just made there. I like that "pillars of trust" concept because the other thing to keep in mind here is you do once again have a number of fierce competitors vying for market share and for IT spend. It's just like the old days all over again. Really, in the old days, by which I mean last year, you had IBM and Oracle and Microsoft and SAP and then Computer Associates and Informatica and all these companies, Teradata, etc. In the new world, now you have got, of course, Microsoft with Azure, you have got Google, you have got Amazon Web Services, you know, you have Facebook in certain contexts. So, you have all these companies that are not necessarily so excited about working with each other, but you do have things like APIs. And so, one of the nice things is that APIs really are crystallizing into the connectors that hold together the larger cloud, I suppose, and I want to throw up a slide for Lawrence to kinda comment on all this.
Yeah, Lawrence, obviously, you guys have specialized in the space for a while. So, I think you do have some advantage over maybe some newcomers. But, nonetheless, these are all very serious concerns because how data gets stored in the cloud is different than how it gets stored on-premises. And I think that Mike makes a really good point that this whole space is just starting to take shape and it's gonna take a while for things to seriously fall into place and to crystallize. So, what's some advice that you have for companies? I guess, do you basically concur with Mike, or what do you think?
Lawrence Schwartz: Yeah. I think, you know, what we see is that when people are taking advantage of the cloud for a lot of use cases as compared to on-premises, they are looking at kind of two different things. One is, as we talked about a little bit earlier, how does it incrementally add value to what I do, you know, how is it kind of an add-on? And so, going back to when I talked about Etix as a company, they are not moving all their operations over to Redshift yet, per se, but they're saying, "I do a lot of work on Oracle, I wanna offload some of this for some kind of analytics in a different environment, you know, kinda figure things out, maybe do some sandbox stuff there, and then learn about my business that way." That way they can kind of carve out what they want, move it over there and do the work, and it's less of a concern than moving everything over, all the records and whatnot. So, I think they look at that as one way to take advantage of it while having fewer issues.
I think the other thing is people are also looking at use cases that are an excellent fit for the cloud and that are very, very hard to do in other ways. So, I will take another example. We work with a company called iN DEMAND. They are a video on-demand player. They do this work for Comcast and all of this, and they will actually take the data that they are working with, the media files, and supply it to the cloud for processing, do their processing there, and then they will consume it back for their on-premises customers. And then, you know, that gets upstairs to third parties that consume reviews. So, if you want to think about how companies are approaching it, it's: how do I add value, how do I maybe not move the whole business at first, how do I get the right use cases, how do I add incremental value to what I do? That helps kinda build up the confidence in what they are doing as part of the process. And, of course, a key piece of that is making sure that they can do that securely and reliably, and we make sure to use the latest levels of encryption and other things to take care of that as much as we can on the transport side. But that's how I think a lot of companies are approaching the problem.
Eric Kavanagh: Okay. Good. And maybe, Ashish, I will throw one last question over to you. I am just throwing up, actually, I like your architecture slide. Even this slide I think is pretty neat. So, one of the questions, you know: HDFS, of course, by design, the default is to save every piece of data three times. You can adjust that, of course, you can make it twice, you can make it four times. That does add some overhead over time, obviously, but it is a way of backing up data. Anyway, that was one of the key ideas, right, of HDFS originally: redundancy, not wanting to lose data. I've kind of been wondering how that's going to affect things like replication servers, quite frankly, when Hadoop does that natively.
But, one of the attendees is asking - "Can you request physical backups like tape for your cloud data? I read of a company that had their cloud management console hacked and their data and online backups trashed."
You know, we are hearing about these breaches all the time, they are getting more and more serious, they are killing major brands like Target, like Home Depot, etc. So, security is an issue and backup and restore is an issue. Can you kinda talk about how you guys address things like backup and restore and security?
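Editor's note: the overhead Eric mentions from HDFS saving every block three times is simple to quantify. The sketch below is illustrative arithmetic only; the function names and the 100 TB figure are hypothetical. The one real setting it mirrors is the HDFS replication factor (the `dfs.replication` property), whose default is 3.

```python
def raw_storage_tb(logical_tb, replication=3):
    """Raw disk needed to hold logical_tb of data at a given replication factor."""
    if replication < 1:
        raise ValueError("replication factor must be at least 1")
    return logical_tb * replication


def copies_that_can_be_lost(replication=3):
    """How many replicas of a block can disappear before the block is gone."""
    if replication < 1:
        raise ValueError("replication factor must be at least 1")
    return replication - 1


if __name__ == "__main__":
    # Hypothetical data set of 100 TB at replication factors 2, 3 and 4.
    for factor in (2, 3, 4):
        print(factor, raw_storage_tb(100, factor), copies_that_can_be_lost(factor))
```

Note that replication protects against losing disks or nodes, not against a deletion or a compromised console, since a delete removes every replica. That is why the attendee's question about separate backups is a distinct concern.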
Ashish Thusoo: Yeah, sure. So, I will talk about that, and talk about HDFS first. As far as Qubole is concerned, since we work in the cloud, we use the object store there to store data. So, again, this is one of the other key differences, why a big data service in the cloud becomes different from on-prem. On-prem, we have always talked about, you know, HDFS and so on and so forth, but if you go to the cloud, a lot of the data is actually stored in the cloud's object stores. For example, that could be S3 on AWS, Google Cloud Storage on Google Cloud, on Google Compute Engine, and so on and so forth.
Now, many of these object stores have built-in capabilities. By the way, one of the big differentiators between real clouds and your own data center is the presence of these object stores, and the reason these object stores are cool pieces of technology is that they are able to provide you very cheap storage and, along with that, things like having disaster recovery built in as part of that interface, so you don't have to think about it. And there is tiering there as well. For example, S3 has high availability and online access, but it's much more expensive than, say, Glacier storage on AWS, where the turnaround time is like four hours or something like that, and which is much cheaper. So, you start thinking of those types of services. I think cloud providers are essentially providing those types of services to address the need for things like tapes and so on and so forth, and also to provide you disaster recovery, or rather replication, built into these systems so that you are protected from disasters, regional disasters and things like that.
So, that is what Qubole heavily, you know, depends upon and the great thing is that a lot of… all the cloud providers are providing this. These are fundamentally very difficult problems to solve and by being built into some of the object stores that these cloud providers provide, you know, that is one more additional reason of, you know, storing this data, you know, in some of these object stores and using the cloud for that as opposed to trying to, you know, figure out, you know, replication, running two Hadoop clusters across different, you know, regions and, you know, trying to replicate data from HDFS from one region to the other, which is doable, we did that a lot when I was back at Facebook running this stuff there, but, you know, fundamentally, the object stores in the cloud just made it that much more easy.
Eric Kavanagh: Okay. Great! Well, folks, we've burned through an hour and 15 minutes or so, a lot of great questions there and a lot of great presentations. Thank you so much to all of our vendors today and, of course, to both of our analysts on the show today. A big thank you, of course, to Qubole, Cloudant and Attunity. We are gonna put the archive up at insideanalysis.com. I showed you where that goes, and big thanks to our friends at Techopedia as well.
So, folks, thank you again for your time and attention. This concludes Episode 3 of TechWise, our relatively new show. There is Episode 4 coming up pretty soon. It's gonna be on the big data ecosystem. So, watch for information on all that. And till then, folks, thank you so much. We will catch up with you next time. Take care. Bye-bye.