In this recipe, we examine data manipulation using SQL. Spark's approach of providing both a programmatic and a SQL interface works well in production settings, where we need not only machine learning but also SQL access to existing data sources, both for compatibility and for familiarity with existing SQL-based systems. Combining DataFrames with SQL makes for an elegant path toward integration in real-life settings.