If you were to ask a random sampling of people what data analysis is, most would say that it is the process of calculating and summarizing data to get an answer to a question. In one sense, they are correct. However, the actions they are describing represent only a small part of the process known as data analysis.
For example, if you were asked to analyze how much revenue in sales your company made last month, what would you have to do in order to complete that analysis? You would just calculate and summarize the sales for the month, right? Well, where would you get the sales data? Where would you store the data? Would you have to clean up the data when you got it? How would you present your analysis: by week, by day, by location? The point being made here is that the process of data analysis is made up of more than just calculating and summarizing data.
A more representative definition of data analysis is the process of systematically collecting, transforming, and analyzing data in order to present meaningful conclusions. To better understand this concept, think of data analysis as a process that encapsulates four fundamental actions: collection, transformation, analysis, and presentation.
Collection. Collection encompasses the gathering and storing of data—that is, where you obtain your data, how you will receive your data, how you will store your data, and how you will access your data when it comes time to perform some analysis.
Transformation. Transformation is the process of ensuring your data is uniform in structure, free from redundancy, and stable. This generally entails things like establishing a table structure, cleaning text, removing blanks, and standardizing data fields.
Analysis. Analysis is the investigation of the component parts of your data and their relationships to your data source as a whole. You are analyzing your data when you are calculating, summarizing, categorizing, comparing, contrasting, examining, or testing your data.
Presentation. In the context of data analysis, presentation deals with how you make the content of your analysis available to a certain audience. That is, how you choose to display your results. Some considerations that go along with presentation of your analysis include the platform you will use, the levels of visibility you will provide, and the freedom you will give your audience to change their view.
As you think about these four fundamental actions, think about this reality: most analysts are severely limited to one tool—Excel. This means that all of the complex actions involved in each of these fundamentals are mostly being done with and in Excel. What's the problem with that? Well, Excel is not designed to do many of these actions. However, many analysts are so limited in their toolsets that they often go into hand-to-hand combat with their data, creating complex workarounds and inefficient processes.
What this book highlights is that there are powerful functionalities in Access that can help you go beyond your one dimensional spreadsheet and liberate you from the daily grind of managing and maintaining redundant analytical processes. Indeed, using Access for your data analysis needs can help you streamline your analytical processes, increase your productivity, and analyze the larger datasets that have reached Excel's limitations.
Throughout this book, you will come to realize that Access is not the dry database program used only for storing data and building departmental applications. Access possesses strong data analysis functionalities that are easy to learn and certainly applicable to many types of organizations and data systems.
After reading the first three chapters, you will be able to demonstrate proficiency in Access, executing powerful analysis on large datasets that have long since reached Excel's limitations. After the first nine chapters, you'll be able to add depth and dimension to your analysis with advanced Access functions, building complex analytical processes with ease. By the end of the book, you'll be creating your own custom functions, performing batch analysis, and developing automated procedures that essentially run on their own. You'll also you will be able to analyze large amounts of data in a meaningful way, quickly slice data into various views on the fly, automate redundant analysis, save time, and increase productivity.
It's important to note that there are aspects of Access and data analysis that are out of the scope of this book.
While this book does cover the fundamentals of Access, it is always in the light of data analysis and it is written from a data analyst's point of view. This is not meant to be an all-encompassing book on Access. That being said, if you are a first-time user of Access, you can feel confident that this book will provide you with a solid introduction to Access that will leave you with valuable skills you can use in your daily operations.
This book is not meant to be a book on data management theory and best practices. Nor is it meant to expound on high-level business intelligence concepts. This is more of a "technician's" book, providing hands-on instruction that introduces Access as an analytical tool that can provide powerful solutions to common analytical scenarios and issues.
Finally, while this book does contain a chapter that demonstrates various techniques to perform a whole range of statistical analysis, it is important to note that this book does not cover statistics theory, methodology, or best practices.
In order to get the most out of this book, it's best that you have certain skills before diving into the topics highlighted in this book. The ideal candidate for this book will have:
Some experience working with data and familiarity with the basic concepts of data analysis such as working with tables, aggregating data, and performing calculations
Experience using Excel with a strong grasp of concepts such as table structures, filtering, sorting and using formulas
Some basic knowledge of Access; enough to know it exists and to have opened a database once or twice
The following sections discuss this books structure and what it has to offer.
Part II, which includes Chapters 1, 2 and 3, provides a condensed introduction to Access. Here, you will learn some of the basic fundamentals of Access, along with the essential query skills required throughout the rest of the book. Topics covered in this part are: relational database concepts, query basics, using aggregate queries, action queries, and Crosstab queries.
Part II introduces you to some of the basic analytical tools and techniques available in Access. Chapter 4 covers data transformation, providing examples of how to clean and shape raw data into staging areas. Chapter 5 provides in-depth instruction on how to create and utilize custom calculations in analysis. Chapter 5 also shows you how to work with dates, using them in simple date calculations. Chapter 6 introduces you to some conditional analysis techniques that allow for the addition of business logic into analytical processes.
Part III demonstrates many of the advanced techniques that truly bring data analysis to the next level. Chapter 7 introduces you to powerful subquery and domain aggregate functionality. Chapter 8 demonstrates many of the advanced statistical analysis that can be performed using subqueries and domain aggregate functions. Chapter 9 provides you with an in-depth look at Access macros and how to schedule batch data processing. Chapter 10 not only shows you how to use SQL and VBA to run data analysis without queries, but also how to create your own custom functions.
Part IV focuses on building reports and visualizations using Access. In Chapter 11, you will cover the basics of turning data into a slick-looking PDF-style Access reports. The chapter also talks about creating charts in Access to enhance the look and feel of Access reports. Chapter 12 discusses the real-world benefits of using the built-in PivotTable and PivotChart functionality found in Access. Chapter 13 demonstrates some of the innovative ways you can implement dashboard-style visualizations in your Access Queries and Reports.
Part V turns your attention to automation and integration, showing you how your reporting mechanisms can be enhanced by leveraging other programs and platforms. Chapter 14 discusses the various ways to move data between Excel and Access using VBA and ADO. Chapter 15 focuses on using Excel and Access automation to manage the inevitable need to show parts of your reporting through Excel. In Chapter 16, you're introduced to the automation techniques, which allow Excel and Access to take control of one another, resulting in some interesting reporting options. In Chapter 17, you get a thorough introduction to XML including a detailed explanation of how XML can collect and transfer data. You conclude with Chapter 18, where you'll get a taste of some of the techniques you can use to integrate Excel and other applications in the Microsoft Office suite. Here, you will be shown how to perform the most common tasks in more efficient ways through integration.
Part VI includes useful reference materials that will assist you in your everyday dealings with Access. Appendix A provides a high-level overview of VBA for those users who are new to the world of Access programming. Appendix B introduces SQL, offering a concise tutorial on SQL syntax and usage. Appendix C offers ideas on how to avoid performance and corruption issues when working with Access databases. Appendix D details many of the built-in Access functions that are available to data analysts.
To help you get the most from the text and keep track of what's happening, we've used a number of conventions throughout the book.
These hold important, not-to-be forgotten information that is directly relevant to the surrounding text.
Notes, tips, hints, tricks, and asides to the current discussion are presented like this.
As for styles in the text:
We highlight new terms and important words when we introduce them.
We show keyboard strokes like this: Ctrl+A.
We show file names, URLs, and code within the text like so: persistence.properties.
We present code as follows:
We use a monofont type for code examples.
3.133.147.158