So just how powerful and efficient is the SQL language? In the previous two sections, we explored both basic and more advanced features of SQL, all of whose functionality could, in theory, be mimicked with just Java (that is, issue a bare-bones SELECT statement to get back the entire table, then parse it with Java if statements, and so on). Now it's time to explore whether there's an actual added advantage to filtering and manipulating our data on the SQLite end (as opposed to on the Java end), and if so, how much of an advantage it provides. To start, we'll need a much bigger data set to better illustrate the differences in performance.
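How the test data itself gets generated isn't shown in this section, but since 26³ = 17,576, one plausible scheme (purely an assumption, not the actual code used) is every three-letter family name from AAA to ZZZ. A minimal plain-Java sketch:

```java
import java.util.ArrayList;
import java.util.List;

public class TestDataSketch {
    // Generate every three-letter name from AAA to ZZZ: 26 * 26 * 26 = 17,576 names.
    static List<String> generateNames() {
        List<String> names = new ArrayList<>();
        for (char a = 'A'; a <= 'Z'; a++) {
            for (char b = 'A'; b <= 'Z'; b++) {
                for (char c = 'A'; c <= 'Z'; c++) {
                    names.add("" + a + b + c);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Each name would then be inserted into the table along with a state
        // and an income, e.g. via ContentValues and SQLiteDatabase.insert().
        System.out.println(generateNames().size()); // 17576
    }
}
```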
First, we create a new table under a new schema that simply has columns for name, state, and income – think of this as a United States database holding each family's name, the state they live in, and their family income. The table has 17,576 rows – still not a lot compared to the magnitude of some real application tables – but hopefully this test table will illustrate some of the performance differences. Let's begin with the WHERE filter:
public class PerformanceActivity extends Activity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        TestSchemaHelper sch = new TestSchemaHelper(this);
        SQLiteDatabase sqdb = sch.getWritableDatabase();

        // TEST WHERE FILTER PERFORMANCE //

        // SQL OPTIMIZED
        long start = System.nanoTime();
        String query = SQLiteQueryBuilder.buildQueryString(false,
                TestTable.TABLE_NAME,
                new String[] { TestTable.NAME },
                TestTable.INCOME + " > 500000",
                null, null, null, null);
        System.out.println(query);
        Cursor c = sqdb.rawQuery(query, null);
        int numRows = 0;
        while (c.moveToNext()) {
            int colid = c.getColumnIndex(TestTable.NAME);
            String name = c.getString(colid);
            numRows++;
        }
        System.out.println("RETRIEVED " + numRows);
        System.out.println((System.nanoTime() - start) / 1000000 + " MILLISECONDS");
        c.close();

        // JAVA OPTIMIZED
        start = System.nanoTime();
        query = SQLiteQueryBuilder.buildQueryString(false,
                TestTable.TABLE_NAME,
                new String[] { TestTable.NAME, TestTable.INCOME },
                null, null, null, null, null);
        System.out.println(query);
        c = sqdb.rawQuery(query, null);
        numRows = 0;
        while (c.moveToNext()) {
            int colid = c.getColumnIndex(TestTable.NAME);
            int colid2 = c.getColumnIndex(TestTable.INCOME);
            String name = c.getString(colid);
            int income = c.getInt(colid2);
            if (income > 500000) {
                numRows++;
            }
        }
        System.out.println("RETRIEVED " + numRows);
        System.out.println((System.nanoTime() - start) / 1000000 + " MILLISECONDS");
        c.close();
    }
}
On the SQLite side, we simply use a WHERE filter that returns all families in our table with a family income of over 500,000. On the Java side, we get the entire table back, loop through each row, and use an if statement to perform the same filtering. We can verify that the output rows are the same, and at the same time compare the speeds of the two methods:
So we see that here there's almost a 5x boost in performance! Next, let's take a look at the performance boost gained when using the GROUP BY clause. On the SQLite side, we'll simply do a GROUP BY on the state column and ask it to count how many families are from each state. Then, on the Java side, we'll ask for the whole table back and manually go through each row, using a standard Map object to keep track of each state and its respective count, as follows:
// TEST GROUP BY PERFORMANCE //

// SQL OPTIMIZED
start = System.nanoTime();
String colName = "COUNT(" + TestTable.STATE + ")";
query = SQLiteQueryBuilder.buildQueryString(false,
        TestTable.TABLE_NAME,
        new String[] { TestTable.STATE, colName },
        null, TestTable.STATE, null, null, null);
System.out.println(query);
c = sqdb.rawQuery(query, null);
while (c.moveToNext()) {
    int colid = c.getColumnIndex(TestTable.STATE);
    int colid2 = c.getColumnIndex(colName);
    String state = c.getString(colid);
    int count = c.getInt(colid2);
    System.out.println("STATE " + state + " HAS COUNT " + count);
}
System.out.println((System.nanoTime() - start) / 1000000 + " MILLISECONDS");
c.close();

// JAVA OPTIMIZED
start = System.nanoTime();
query = SQLiteQueryBuilder.buildQueryString(false,
        TestTable.TABLE_NAME,
        new String[] { TestTable.STATE },
        null, null, null, null, null);
System.out.println(query);
c = sqdb.rawQuery(query, null);
Map<String, Integer> map = new HashMap<String, Integer>();
while (c.moveToNext()) {
    int colid = c.getColumnIndex(TestTable.STATE);
    String state = c.getString(colid);
    if (map.containsKey(state)) {
        int curValue = map.get(state);
        map.put(state, curValue + 1);
    } else {
        map.put(state, 1);
    }
}
for (String key : map.keySet()) {
    System.out.println("STATE " + key + " HAS COUNT " + map.get(key));
}
System.out.println((System.nanoTime() - start) / 1000000 + " MILLISECONDS");
c.close();
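As a side note, the containsKey/put pattern in the Java-optimized branch can be compressed with Map.merge (available since Java 8). Here is the same counting logic as a self-contained sketch:

```java
import java.util.HashMap;
import java.util.Map;

public class GroupBySketch {
    // Count occurrences of each state, equivalent to the containsKey/put loop.
    static Map<String, Integer> countByState(Iterable<String> states) {
        Map<String, Integer> counts = new HashMap<>();
        for (String state : states) {
            counts.merge(state, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return counts;
    }
}
```

This doesn't change the performance story – every row still has to cross the Cursor boundary into Java – it just tightens the loop body.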
And let's see how well we did:
So we see that in this case the performance boost was there but less pronounced, giving us a 33 percent gain in efficiency. It's important to note that these statistics are highly dependent on the schema and size of your tables, so take the numbers with a grain of salt; the goal of these little experiments is just to give us an idea of how the two methodologies compare. Lastly, let's take a look at how a built-in aggregate function like avg() in SQL compares with Java. The code for both methodologies is as follows:
// TEST AVERAGE PERFORMANCE //

// SQL OPTIMIZED
start = System.nanoTime();
colName = "AVG(" + TestTable.INCOME + ")";
query = SQLiteQueryBuilder.buildQueryString(false,
        TestTable.TABLE_NAME,
        new String[] { colName },
        null, null, null, null, null);
System.out.println(query);
c = sqdb.rawQuery(query, null);
while (c.moveToNext()) {
    int colid = c.getColumnIndex(colName);
    double avgIncome = c.getDouble(colid);
    System.out.println("AVG INCOME " + avgIncome);
}
System.out.println((System.nanoTime() - start) / 1000000 + " MILLISECONDS");
c.close();

// JAVA OPTIMIZED
start = System.nanoTime();
colName = TestTable.INCOME;
query = SQLiteQueryBuilder.buildQueryString(false,
        TestTable.TABLE_NAME,
        new String[] { colName },
        null, null, null, null, null);
System.out.println(query);
c = sqdb.rawQuery(query, null);
double sumIncomes = 0.0;
double numIncomes = 0.0;
while (c.moveToNext()) {
    int colid = c.getColumnIndex(colName);
    int income = c.getInt(colid);
    sumIncomes += income;
    numIncomes++;
}
System.out.println("AVG INCOME " + sumIncomes / numIncomes);
System.out.println((System.nanoTime() - start) / 1000000 + " MILLISECONDS");
c.close();
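For clarity, the Java-optimized branch computes exactly what SQLite's AVG() computes: a running sum divided by the row count. Isolated from the Cursor plumbing, the logic is just:

```java
public class AvgSketch {
    // Running sum divided by row count, mirroring SQLite's AVG(income).
    static double average(int[] incomes) {
        double sum = 0.0;
        for (int income : incomes) {
            sum += income;
        }
        return sum / incomes.length;
    }
}
```

The sum is accumulated in a double rather than an int so that it cannot overflow once the incomes add up past Integer.MAX_VALUE.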
And taking a quick look at what the output gives us:
And wow – enough said. The results from both methods are identical, but the SQL function gave us a 16x gain in performance.
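One last note on methodology: each measurement above is a single System.nanoTime() reading taken before and after the work. That repeated pattern could be factored into a tiny helper like the sketch below (an assumption for illustration – the code in this section simply inlines the arithmetic):

```java
public class Stopwatch {
    // Records the creation time; elapsedMillis() reports time since then.
    private final long start = System.nanoTime();

    long elapsedMillis() {
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

Single-run timings like these are also sensitive to caching and garbage collection, which is one more reason to treat the exact multipliers as rough indicators rather than hard numbers.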