© Malathi Mahadevan 2018
Malathi Mahadevan, Data Professionals at Work, https://doi.org/10.1007/978-1-4842-3967-4_20

20. Kirsten Benzel

Senior Database Engineer, SurveyMonkey
Malathi Mahadevan (Raleigh, NC, USA)

Kirsten Benzel is a mischievous senior database engineer who has been working for SurveyMonkey since 2014. Prior, she enjoyed the deserts of Arizona while working on-site for GoDaddy, and then found the Bay Area also to her liking while she worked remotely for several years. She became fascinated with computers at an early age. When she’s not gaming, you can usually find her gleefully tuning a query or wandering a beach looking for sea glass. Her career in information technology began when she was unable to find employment as a town philosopher after earning her BA in Philosophy from Northern Arizona University. She became visible in the SQL community while twice administrating Argenis Without Borders, a fundraiser for Doctors Without Borders that ended during the annual PASS Summit. She can be reached on Twitter at @cybersnark.

Mala Mahadevan: Describe your journey into the data profession.

Kirsten Benzel: I grew up with computers and for that I’m lucky. I remember eight-inch floppies and the days when a file name could only be eight characters! It’s because of this early exposure that later on as an adult I was able to successfully launch into IT. I think early, hands-on access to computers is absolutely critical in removing the intimidation factor that so often steers young people away from a career in IT.

After two semesters in college, I chose my major by reviewing the courses I had completed and identifying correlations between the ones that I enjoyed—all very data-driven. That led me to a BA in Philosophy with some HTML, CSS, and C# on the side. I admit to being a terrible “traditional” programmer even then, and the advice I was given early on wasn’t helpful in finding a better path forward. If you had asked the college version of me what my plans were, I would have replied that I knew I needed to learn programming languages like C, C++, SQL, Python, and Perl. I had no idea which way to go nor any idea what that list even had in common! My advice to more seasoned advisors would be to learn what your protégé has tried, then learn their strengths, and from there recommend just one or two similar technologies that could help them into a general area they might enjoy.

Mala: Describe a few things you know now that you wish you had known when you started your career, and that you would recommend newcomers to this line of work learn.

Kirsten: Luck plays a big role. I started out as a call center representative knowing exactly nothing and worked hard over the span of five years to become a database developer. It’s because of luck that I landed in a company that promoted internal hiring and opened doors. But it’s hard work and the ability to soak up information that ultimately got me the role.

No one gets into database work on purpose, and it’s a shame! Colleges teach rudimentary SQL usually as some auxiliary to something else, and that’s it. There’s a reason you see a slew of Accidental DBA classes in the wild. I wish I had known SQL was “a thing.” It’s really up to us—industry professionals—to shoulder the task of educating colleges and parents on its very existence and necessity.

I’ve always loved Excel, data, making correlations, and drawing conclusions. We’re doing a disservice to young people just like me who are blind to this as even being a career option. You’ll often hear about the lack of women in STEM [science, technology, engineering, mathematics], and I think these two issues have the same root cause: a lack of early exposure. Have your child build a website, learn what a web server is, and spin up a little MySQL database on the back end as their class science project. Make a database of frogs, or horses, or dinosaurs, and display them on a basic site. It might seem intimidating in print for me to just haul off and say that, but we literally have a world of tutorials at our fingertips with a single click.

When I’m talking to someone who already has an interest in database work, I always recommend “the book”—T-SQL Fundamentals by Itzik Ben-Gan [Microsoft Press, 2016]. This is my way of passing along some of the best advice that I was given by our senior DBA, Pete, who said, “Read anything by Itzik Ben-Gan.” That advice is still rock solid even now, ten years later.

Mala: What was analytics/data visualization like five years ago? What has changed, and why is it such an “in” thing now?

Kirsten: Five years ago, we didn’t have the raw data that we have today. I realize this might seem like I’m stating the obvious, but when you consider the difficulty of separating signal from noise and then trying to visualize it and explain it to an audience who has a need to know, it can be daunting. We joke in the industry about the term big data and what it means. I like the definition from a course on Amazon Redshift that I just completed: “Big data is high-volume, high-velocity, and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation.” Data visualization is not easy, and as our data sets grow and change, the tools will advance and the need for experts will grow. It’s a promising area, especially if coupled with a solid foundation in statistics.

Mala: What are some ideas one can present to management as a BI person as far as analytics, big data, and buzzwords like that go?

Kirsten: Buzzwords are a personal pet peeve of mine! Every time I hear the word ask used as a noun, I cringe. At one point at work, we were renaming all our conference rooms and I suggested we use buzzwords and phrases: Take It Offline, All Hands on Deck, On Your Plate, Outside the Box, Close the Loop, The Next Level, Core Competency, Window of Opportunity, Low-Hanging Fruit. I made a list of thirty-plus! My recommendations weren’t considered, but boy did we have a lot of fun making the list.

I think imprecise language is annoying at best and damaging at worst. Let me give you a real-world example. A common metric that many companies use is engagement. On the surface, this seems like an obvious thing. It’s how much users engage with the product. But when you’re trying to measure it and present, say, a line chart of “engagement,” what exactly does that look like? You have to go much, much further and define the metrics you want to measure, whether that’s logins, page visits, page clicks, page hovers, or purchases. To tell my VP that “engagement fell off” is alarming! And it’s much less meaningful than perhaps, “Starting seven days ago, page clicks to the English version of the billing page fell off by fourteen percent.” I don’t personally use buzzwords, and I view speaking properly, exactly, and directly as an exercise in ethics. You should read La Parole by Georges Gusdorf [Northwestern University Press, 1979] if language and ethics interest you. Precision of language is key, as is adding as much relevant context to a statement as possible, which leads into the next section on storytelling.
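To make the contrast with a vague "engagement fell off" concrete, here is a minimal Python sketch of how a precisely worded statement about one named metric might be produced. The function names, the click counts, and the start date are all hypothetical and chosen only for illustration; they do not come from the interview.

```python
from datetime import date


def percent_change(previous: float, current: float) -> float:
    """Return the percent change from previous to current (negative means a drop)."""
    if previous == 0:
        raise ValueError("previous period has no activity; percent change is undefined")
    return (current - previous) / previous * 100.0


def describe_metric(metric: str, previous: float, current: float, since: date) -> str:
    """Build a statement that names the metric, the start date, and the size of the change."""
    change = percent_change(previous, current)
    direction = "fell" if change < 0 else "rose"
    return f"Starting {since.isoformat()}, {metric} {direction} by {abs(change):.0f} percent."


# Hypothetical weekly click counts for one specific page, not real data.
previous_week_clicks = 5000
current_week_clicks = 4300

message = describe_metric(
    "page clicks to the English billing page",
    previous_week_clicks,
    current_week_clicks,
    date(2018, 1, 8),
)
print(message)
```

The point of the sketch is that the metric name, the time window, and the magnitude all have to be supplied explicitly before the sentence can be built at all.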

Mala: Some people think storytelling means weaving stories around data. Data should just present facts. What is your take?

Kirsten: I’ve never heard that data should be presented as a story. My first inclination when I hear this is to ask why? In my current and previous roles, it’s been my job to present accurate, timely, raw data. Just facts. Maybe the story-thing is relevant to analysts who have to answer why.

In my opinion, if I do my job right, then it should make the “intermediary why” easier. What do I mean by this? Tell me why orders for appletinis dropped off starting last Tuesday. Well, if I show you an unsorted list of our martinis, you might just smile and nod. If instead I show you the same list but sorted by the primary flavor with the two flavors that are out of stock crossed out at the top, you can now very easily know why I’ve brought this to your attention. The “intermediary why” of “Why is the apple flavor out of stock?” directly relates to the why of “Why did orders drop off?” It’s a crude example, but you get the gist of it.

The quality of how I display data to our analysts is, I think, just as important as how fast and how often it’s accessible. Smart BI highlights anomalies. It’s perhaps not my job to tell the story, but I can try to point the team in the direction of an interesting one if I’m doing it right.

Mala: What is your experience with agile methodologies and business intelligence?

Kirsten: I’ve used agile at both companies that I’ve worked in IT for—GoDaddy and SurveyMonkey. I love the idea of sprints, and when managed properly, they’re incredibly useful. On the other hand, two of the most difficult things in IT are estimating the time it’s going to take to do something and naming things. Go ahead and laugh at the naming thing, but you’ll see!

My general rule is to make a time estimate guess and then triple it. If I think something will take me five minutes, it will probably take fifteen. A one-day project should be blocked off as three. If I may slice and dice a few clichés: over-estimate and just deliver.

I think it’s harder for a BI team to use agile than other branches of the organization. The reason is that BI or data teams are in thrall to most other teams. We can’t create a report until operations deploys the feature. We can’t analyze the A/B test until the data comes in. Meanwhile, we’re being asked questions from project managers and marketing, and all the while infrastructure is trying to optimize and improve processes, and code, and reports under our feet. When you try to put all those moving parts into neat, tidy, little two-week chunks, you need to have a top-notch team or be excruciatingly patient with scope creep.

Mala: Describe your experience with cloud adoption.

Kirsten: The cloud is another one of those buzzwords that really ruffles my feathers. Someone funnier than me said that the cloud is just “somebody else’s computer” and I prefer that definition. There’s nothing spooky about it. You’re literally just loading your data somewhere else and processing it there instead of here, on premises.

My experience is limited but that’s changing fast. As I said earlier, I just finished a course on AWS Redshift. The cloud is fantastic for distributed processing. What’s that? Well, I really like analogies. If you can imagine a basket full of apples that need to be peeled, and then imagine one person peeling one apple at a time, you can see where that might be suboptimal—a favorite term at work! Instead, give ten people each a handful of apples and yell, “Go!” That’s distributed processing in a nutshell.

If you group your data into files—yes, just like the .txt files you see on a consumer grade computer—split out by month, then you can have many threads—thread equals apple peeler—chew through your data all at once in order to answer a date-related question. It’s much faster, but it means you must choose carefully how you physically slice your data.
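The apple-peeler idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory dictionary stands in for monthly files, the order amounts are made up, and the function names are hypothetical. A real warehouse such as Redshift spreads the work across nodes rather than threads in one process.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each key stands in for one monthly file of order amounts.
monthly_partitions = {
    "2018-01": [120.0, 80.5, 42.0],
    "2018-02": [99.0, 15.25],
    "2018-03": [300.0, 10.0, 5.0, 1.5],
}


def total_for_month(item):
    """One 'apple peeler': sum the orders in a single month's partition."""
    month, amounts = item
    return month, sum(amounts)


def totals_by_month(partitions, max_workers=4):
    """Hand each partition to a worker thread and collect the per-month totals."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(total_for_month, partitions.items()))


def totals_for_range(partitions, first_month, last_month):
    """Answer a date-related question by touching only the partitions in range."""
    wanted = {
        month: rows
        for month, rows in partitions.items()
        if first_month <= month <= last_month
    }
    return totals_by_month(wanted)


results = totals_for_range(monthly_partitions, "2018-02", "2018-03")
print(results)
```

Because the data is sliced by month, a question about February and March never has to read January at all. A question about, say, customers rather than dates would get no such shortcut, which is why the choice of slice matters.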

Distributed processing is pretty neat, but there are a lot of factors to consider before you choose the cloud as your poison of choice. The how and why of that fall well outside this chat, but hopefully, I’ve at the very least generated some moderate interest in using cloud computing.

Mala: What are some of the common data quality issues when dealing with data? What can be done to avoid them or to mitigate their impact?

Kirsten: End users and data have something in common: if they can, they will. No, really. The things I have seen people do and the data that’s gone by on my screen just boggles the mind. When you’re wearing your QA hat, you must try and do every possible thing that an end user might do. Yes, someone really will paste the entire contents of a book into the text field that dev didn’t put a limit on so you better be ready. Because of my experience—or maybe my lack of imagination?—I really prefer the whitelist approach over the blacklist approach when it comes to dealing with data integrity.

When I’m designing an ETL or a new database, I choose the correct datatypes and data lengths based on what should be allowed. This is called the whitelist approach. I’m giving the okay or “whitelisting” allowed behaviors and values. Why this approach? If you take the blacklist approach and only define what isn’t allowed, you will be forever lost in the Dungeon of Guess What Bad Data Came in Today. Roll fortitude. Your life will be an endless list of rules that you have to keep updating every time you encounter a new violation. And that’s just suboptimal.

What does this actually look like? Some examples: no leading or trailing spaces. An email address must have an at sign and a dot to be valid. An IP address must have three dots or at least two colons to be valid with a maximum length of thirty-nine for a fully enumerated IPv6 address. Use NVARCHAR for anything an end user is going to enter so you don’t HORK the data. Use Python to do string matching because RegEx options in T-SQL are scarce. Use the most current time type that you can because Microsoft improves them every time. Right now, that’s datetime2. And last but definitely not least, provision your servers to use UTC as the default and store your timestamps as UTC. Trust me on that last one.
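A few of these whitelist rules can be sketched in Python, which the answer above suggests for string matching. This is a minimal illustration, not a production validator: the function names are hypothetical, the email pattern is deliberately loose (it checks only for an at sign and a dot), and the IP check leans on the standard library's ipaddress module instead of counting dots and colons.

```python
import ipaddress
import re
from datetime import datetime, timedelta, timezone

# Loose whitelist pattern: something, an at sign, something, a dot, something.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_clean_text(value: str) -> bool:
    """Allow only values with no leading or trailing whitespace."""
    return value == value.strip()


def is_valid_email(value: str) -> bool:
    """Allow only trimmed values that match the loose email pattern."""
    return is_clean_text(value) and EMAIL_PATTERN.fullmatch(value) is not None


def is_valid_ip(value: str) -> bool:
    """Allow only strings of at most 39 characters that parse as IPv4 or IPv6."""
    if len(value) > 39:
        return False
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


def to_utc(moment: datetime) -> datetime:
    """Convert a timezone-aware timestamp to UTC before it is stored."""
    if moment.tzinfo is None:
        raise ValueError("refusing a naive timestamp; its time zone is unknown")
    return moment.astimezone(timezone.utc)


# Hypothetical inputs for illustration.
eastern = timezone(timedelta(hours=-5))
stored = to_utc(datetime(2018, 1, 1, 12, 0, tzinfo=eastern))
print(is_valid_email("reader@example.com"), is_valid_ip("192.168.0.1"), stored.isoformat())
```

Each function states what is allowed and rejects everything else, so a new kind of bad input does not require a new rule.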

Mala: What are some of your favorite tools and techniques?

Kirsten: I did a presentation on how to go from new hire to deployment complete in one day or less. One of my tricks is to have three programs installed: Softerra LDAP browser, Sublime Text, and Redgate SQL Compare.

Why do you need an LDAP browser? Because when you’re first starting out and even after you’ve been doing the job for a decade, you’re going to get asked Active Directory questions. It’s going to make your IT department so much happier if you tell them what group you need to be added to after you’ve confirmed that it actually exists and you’ve spelled it right. Plus, “Why can’t I see this database? I used to have it!” is at a minimum a weekly question you’ll get from users, and you’ll want a quick way to verify or compare AD groups.

Notepad is evil. Use it for notes that you don’t care about. The big kids use Sublime Text! It does so much. It’s color-coded, and you can compare documents side by side, to name but a few reasons why. Also, did you know that in Notepad [the default text editor installed on a Windows machine] when it’s doing a find-and-replace it returns to the top of the document after each iteration? So suboptimal. Although to really level up, you’ll want to become a Vi/Vim master.

Last, grab something to compare database objects. Choose a program that doesn’t lock the schema as it works. From day one, you’re going to be comparing database objects and occasionally the data in small lookup tables, so get one of these, and get it quickly. I like Redgate Compare.

Mala: What is the role of documentation in being a good BI/analytics person?

Kirsten: A wise person once told me that “as soon as you write it down, it’s obsolete.” There’s some brutal truth to this, but it doesn’t mean you shouldn’t document anything.

Documentation is critical to the success of your team. If you skip to the question about work/life balance, you’ll see how important camaraderie is, which reflects the importance of documentation. In an ideal world, tribal knowledge would be a thing of the past. I think perhaps it’s best to shoot for a happy medium. If you can hand a new hire your documentation and have it get them somewhere around one-third of the way to “up to speed,” then you’re doing pretty good! At a minimum, you should keep a data dictionary for all the metrics and slang and the endless, company-specific acronyms. Network diagrams never hurt. Common workflows and ETLs, and of course, it’s nice to have an onboarding where-to-find-it document.

I just recently started using Alation, and I couldn’t love it more. Not only does it automatically pull in database schemas and stored procedures but you can easily create really slick-looking documents and link them to code snippets and database objects. Sure, this sounds like a sales pitch, but maybe it is? I’ve always personally struggled with documentation as a chore, but I’m finding I actually look forward to using Alation. That’s high praise.

Some people hoard information either by accident or as a misguided job security technique. Keep in mind that if they cannot replace you, they cannot promote you. If no one else can do your job because there’s been zero knowledge transfer, then you’ll be on the hook 24/7/365, and that’s not healthy for you or the company.

Mala: What are your favorite books, blogs, or other means of learning?

Kirsten: I’m a huge fan of RSS feeds for keeping up with technology. Another good resource is the free SQLSaturday that PASS sponsors. If you don’t know what a SQLSaturday is, you’re missing out. These events are where the industry experts—the movers and shakers, if you will—present talks and classes, every Saturday, to help promote community and to help educate attendees. Find your local chapter.

Last, I like Twitter as a means of keeping my thumb on the pulse of the industry. I love how it’s an easy way to keep up with everything at once, and I consequently hate how sometimes it’s like trying to drink from a firehose. Learn to create lists and keep the people you follow in tidy groups so you don’t get overwhelmed or succumb to the brutal context switching that your main feed consists of.

Mala: What are your recommended ways of stress management and developing healthy work/life balance? How do you handle long grinds on the job? How do you work long hours that are boring and not challenging while still keeping motivated?

Kirsten: I struggle with this because I’m a work “sprinter” and not a “marathoner.” If you’re like me, you go on four-to-six-hour benders and never take the headphones off or leave your chair. Well, the facts are in, and this is disastrous for your health both physical and mental. I’ve just recently created alerts on my phone to remind me to stand up and stretch hourly and to eat something every three hours. I’m a recent convert to Bulletproof coffee and I’m a big fan of Vital Proteins Collagen Creamer and organic maple syrup in lieu of the old traditional sugar and creamer. Add a dash of cinnamon, and you’ve got delicious brain food right in your hand.

Sleep is tough, too. When you’re under deadline pressure, it’s almost impossible to rest sometimes. Deep breathing gets me to sleep if nothing else works. For those wicked on-call rotations, I’d recommend setting up a buddy system—this is where documentation is key—so once you’ve had two or three bad nights, you can hand over the pager to a buddy so you can recover. Similarly, keep an eye on your teammates, yank the pager from them, and cover if they’ve been suffering for a few days and are too tired to notice. It’s a give and take.

Being constantly connected is probably the worst part of working in tech. I cannot count the number of times I get an envious response when I tell people that I can work from home, because what they don’t realize is that the laptop is not freedom, it’s a tether. It’s up to you to set boundaries, and I really struggle with this, too. If I wake up at three a.m., bored, I’ll often hop on and work without the usual interruptions. It’s up to me to then clock out at a reasonable time and not overdo it. Otherwise, my boss jokingly threatens to yank my AD permissions!

Another thing that’s relatively new is the concept of unlimited vacation. It’s one of those deceptive things that sounds good on the surface but sometimes has hidden teeth. With the introduction of unlimited vacation, the burden of how many days off is appropriate has been moved from the company to the employee. Now, if you are perceived to take off too much time, it can have a negative impact on how your work ethic is regarded. Again, managing this in a healthy way is up to you. You must take time off to disconnect and recuperate, or even to take care of family needs, and the best balance is going to be different for each person. Don’t feel guilty for taking personal time to recharge.

I can’t finish a section on work/life balance without addressing the B word—burnout. It’s real. It’s prolific. And if you don’t set boundaries, it’s going to bite you really hard and really fast. Once you’re burned out, I honestly don’t know how you recover, so try not to go there. I have a slide saved on my phone with the six causes of burnout by Cate Huston:
  1. Lack of control
  2. Insufficient reward
  3. Lack of community
  4. Absence of fairness
  5. Conflict in values
  6. Work overload

I’m lucky enough to work in a great company and on a superb team that makes one through five a non-issue for me. Work overload is tougher. As database professionals, there is always more work to do. In this job, you never, ever clear the queue. And somehow you each individually have to make peace with that. Again, disconnecting is key, even if you choose to just sit at home and stare at the wall! If you feel burnout creeping in, try to reach out. If you cannot, it may be time to look for a place with values that more closely mirror your own. Remember, not all matches work out and that’s okay.

One last thing. It’s not about how you fail, it’s how you recover from failure. Be transparent, take ownership, fix it quickly, and learn from it.

Mala: Describe your style of interviewing a data professional. What do you look for, and what are some examples of questions you ask?

Kirsten: I conduct two types of interviews—technical and non-technical. The technical interview is a list of technical questions that we as a team have created for the open role. My non-technical interview, on the other hand, is an hour of talk. I’ll share my top-two favorite questions to ask.

Pretty sure I stole this one from Brent Ozar. I like to ask the interviewee to tell me about something they’ve recently done, a project, that they’re proud of. It doesn’t even have to be SQL related. I’m looking for that spark, and to start up a dialog where they genuinely want to share with me something cool. Interestingly enough, you cannot fail this question, and it’s led to some really quite interesting conversations that I’ve enjoyed.

Second, this one I borrowed from a TED Talk by Adam Grant on givers and takers (highly recommended!). I ask them to list three to five people whose careers they have influenced. The ideal answer will be a handful of people in positions equal to or lower than their own, whom they have mentored.

Mala: Can you narrate a funny or an interesting story to share with our readers?

Kirsten: A man lived in a poor area in a small home with a dirt yard. Every morning he’d watch from the window as a neighbor dumped trash into his yard. He never said anything and quietly cleaned it up, day after day. One day, the neighbor did not appear with the trash. The man hurried next door to make sure his neighbor was okay.

Even if you have to keep cleaning up messes, remember what’s important.

Key Takeaways

  • Precision of language is key, as is adding as much relevant context to a statement as possible.

  • Smart BI highlights anomalies. It’s perhaps not my job to tell the story, but I can try to point the team in the direction of an interesting one if I’m doing it right.

  • End users and data have something in common: if they can, they will.

  • Keep in mind that if they cannot replace you, they cannot promote you.

  • It’s not about how you fail, it’s how you recover from failure. Be transparent, take ownership, fix it quickly, and learn from it.

Favorite tools: Softerra LDAP browser, Sublime Text, Redgate SQL Compare, Alation

Recommended books: T-SQL Fundamentals by Itzik Ben-Gan, La Parole by Georges Gusdorf
