Foreword to the Second Edition

My journey with Pivotal began in 2014 at Morgan Stanley, where I am the global head of database engineering. We wanted to address two challenges:

  • The ever-increasing volume and velocity of data that needed to be acquired, processed, and stored for long periods of time (more than seven years, in some cases)

  • The need to satisfy the growing ad hoc query requirements of our business users

Nearly all the data in this problem space was structured, and our user base and business intelligence tool suite used the universal language of SQL. Upon analysis, we realized that we needed a new data store to resolve these issues.

A team of experienced technology professionals spanning multiple organizational levels evaluated the pain points of our current data store product suite in order to select the next-generation platform. The team’s charter was to identify the contenders, define a set of evaluation criteria, and perform an impartial evaluation. Some of the key requirements for this new data store were that the product could easily scale, provide dramatic query response time improvements, be ACID and ANSI compliant, leverage deep data compression, and support a software-only implementation model. We also needed a vendor that had real-world enterprise experience, understood the problem space, and could meet our current and future needs.

We conducted a paper exercise on 12 products followed by two comprehensive proofs-of-concept (PoCs) with our key application stakeholders. We tested each product’s utility suite (load, unload, backup, restore), its scalability and linear query performance, and its ability to recover seamlessly from server crashes (high availability) without causing an application outage. This extensive level of testing allowed us to gain an intimate knowledge of the products, how to manage them, and even some insight into how their service organizations dealt with software defects. We chose Greenplum due to its superior query performance using a columnar architecture, ease of migration and server management, parallel in-database analytics, the product’s vision and roadmap, and strong management commitment and financial backing.

Supporting our Greenplum decision was Pivotal’s commitment to our success. Our users had strict timelines for their migration to the Greenplum platform. During the PoC and our initial stress tests, we discovered some areas that required improvement. Our deployment schedule was aggressive, and software fixes and updates were needed at a faster cadence than Greenplum’s previous software-release cycle. Scott Yara, one of Greenplum’s founders, was actively engaged with our account, and he responded to our needs by assigning Ivan Novick, Greenplum’s current product manager, to work with us and adapt their processes to meet our need for faster software defect repair and enhancement delivery. This demonstrated Pivotal’s strong customer focus and commitment to Morgan Stanley. To improve the working relationship even further and align our engineering teams, Pivotal established a Pivotal Tracker account (Pivotal Tracker is an issue tracker similar to Jira), which shortened the feedback loop and improved Morgan Stanley’s communication with the Pivotal engineering teams. We had direct access to key engineers and visibility into their sprints. This close engagement allowed us to do more with Greenplum at a faster pace.

Our initial Greenplum projects were highly successful, and our plant doubled annually. The partnership with Pivotal evolved, and Pivotal agreed to support our introduction of Postgres into our environment, even though Postgres was not a Pivotal offering at the time. As we became customer zero on Pivotal Postgres, we aligned our online transaction processing (OLTP) and big data analytic offerings on a Postgres foundation. Eventually, Pivotal would go all in with Postgres by open sourcing Greenplum and offering Pivotal Postgres as a generally available product. Making Greenplum the first open source massively parallel processing (MPP) database built on Postgres gave customers direct access to the code base and allowed Pivotal to tap into the extremely vibrant and eager community that wanted to promote Postgres and the open source paradigm. This showed Pivotal’s commitment to open source and allowed them to leverage open source code for core Postgres features and direct their focus on key distinguishing features of Greenplum such as an MPP optimizer, replicated tables, a workload manager (WLM), range partitioning, and a graphical user interface (GUI) command center.

Greenplum continues to integrate its product with key open source compute paradigms. For example, with Pivotal’s Platform Extension Framework (PXF), Greenplum can read from and write to the Hadoop Distributed File System (HDFS) and its various popular formats such as Parquet. Greenplum also has read/write connectors to Spark and Kafka. In addition, Greenplum has not neglected the cloud: it can write to an Amazon Web Services (AWS) Amazon Simple Storage Service (Amazon S3) object store, and it offers hybrid cloud solutions that run on any of the major cloud vendors. The cloud management model is appealing to Morgan Stanley because managing large big data platforms on-premises is challenging. The cloud offers near-instant provisioning, flexible and reliable hardware options, near-unlimited scalability, and snapshot backups. Pivotal’s strategic direction of leveraging open source Postgres and investing in the cloud aligns with Morgan Stanley’s strategic vision.

The Morgan Stanley Greenplum plant ranks among the top five Greenplum customer footprints, thanks to the contributions of many teams within Morgan Stanley. As our analytic compute requirements grow and evolve, Morgan Stanley will continue to leverage technology to solve complex business problems and drive innovation.
