Advice for New Data Professionals

If you’re reading this as an aspiring (or budding) data analyst, data engineer, or data scientist, welcome! You’re in for a heck of a ride if you’re committed, and it’s always nice to meet new data people. There are a lot of things you will need to learn. That’s especially true if you love self-punishment and decide to follow the path I took, alternating between all three data roles. Some might call that the “full-stack” data path. Others might call it stupid.

Regardless of what path(s) you choose, there are some bits of advice that I wish someone had given me when I was getting started. They would have made my formative years as a data science noob so much easier. Even if you’re not new to data, I think you’ll find this advice useful. It will keep you relevant, mentally sharp, and adaptable. After all, “change” is a constant companion to our line of work, and this advice will help you keep pace.

Know How Things Work Instead of How to Push Buttons

If you’ve been hitting Google with queries like “most important data skills,” you’re likely used to seeing advice that looks something like this:

  • “Learn R.” 
  • “You should focus on ALL the Apache products.”
  • “Honestly, everything is AWS these days.”
  • “Machine learning, best learning.”

Yes, such things will be necessary for your work. The diversity of things you’ll use throughout your career is boggling. You should certainly learn how to use them, but you should approach your learning in a specific way.

Here’s the single most important piece of advice I wish I had been given as a new data analyst/engineer/scientist:

It’s critical to learn the underlying mechanics of your tools, why/when to use them (specifically), and how to adapt them when situations aren’t clear cut.

Data advice from some wise data person

You Want to Be Ready for Anything

Let’s say you’re remodeling your living room. It’s been gutted so that only the four exterior walls remain. Being a savvy builder, you’re prepared. You have your blueprints, materials, and a bunch of tools on hand.

The first step is to frame out the new interior partitions. What do you do? Well, you probably go to your toolbox and grab your measuring tape, levels, guides, and a pencil. Next, you mark all the necessary cuts and guidelines. After that, you get your chop saw and cut the framing wood. You see where this is going.

But what if your chop saw breaks midway through your cuts? You’ve only ever accomplished this framing task with a chop saw. Fortunately, you’re a well-informed builder. You not only know how to use your tools, you also understand their strengths and weaknesses, and how to adapt them when things don’t go as planned.

Remembering that you have a handheld circular saw in your toolbox, you go grab it. You then assemble a simple guide rig from some 2×4 pieces and a pair of clamps, strap it all to a table, and voila! You’re back in business and can make all the clean cuts you need. 

A Good Data Professional Isn’t Tethered to Specific Tools

Builder-you was able to tackle the builder problem because you had a variety of fundamental tools in a toolbox at your disposal. You were also able to recognize what tools to use to solve specific problems.

When you knew you needed to measure lengths, you reached for a measuring tape and pencil. When you knew you needed to cut some framing wood, you went to the chop saw. Even more important, when things didn’t go as planned, you had the knowledge to immediately adapt and improvise.

A good data professional should never become tethered to a specific platform or one specific way of doing things.

Data advice from another wise person

A Real-World Example

When I transitioned to my latest role, all the vendor platforms were new to me. I went from ticketing and fundraising data systems to a suite of tools designed to manage education data. Such industry shifts are common in data science work, so you need to be prepared.

The software used by one organization may not be the software used by your next one. Even if you’re lucky enough to transition into identical platforms, remember that vendor selection is very political. You may be stuck with ineffective platforms because some misguided executives locked your team into a contract or picked something to align with a specific consultant’s… “recommendation.”

My new employer was actually really anxious about this unfamiliarity. How could I efficiently extract data from a system I’d never used before? It apparently took hours to pull even simple reports from that software. I wasn’t worried. I knew that regardless of how a system looks on the outside, everything is pretty much the same under the hood.

Data Systems Typically All Look the Same Under the Surface

Almost all data platforms you will encounter are built using just a handful of common components. On the surface, they have shiny UIs that make them “stand out from the rest,” but below the surface, they are often remarkably similar. Side note: even those UIs are often built on the same frameworks, with React, Angular, and Vue making up the lion’s share of the market right now.

That’s a powerful realization. It means that if you can understand the core languages, methods, and architectures that are common throughout the data software ecosystem, you have the keys to the kingdom!

For example, once you understand SQL, you’re equipped to understand the different dialects of SQL (e.g., PostgreSQL, T-SQL, and MySQL). After you understand those, you can interface with just about any database currently in use worldwide.
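To make that a little more concrete, here is a minimal sketch. It assumes a made-up `donations` table and uses Python’s built-in `sqlite3` module as a stand-in for whatever database you actually face; the table, the rows, and the function names are all invented for illustration. The point is that the query itself is plain, core SQL of the kind the major dialects share.

```python
import sqlite3

# Hypothetical rows, invented purely for this illustration.
ROWS = [
    ("alice", 50.0),
    ("bob", 20.0),
    ("alice", 25.0),
    ("carol", 10.0),
]

# Core SQL: SELECT, an aggregate, GROUP BY, ORDER BY.
# Nothing here is specific to SQLite.
TOTALS_QUERY = """
    SELECT donor, SUM(amount) AS total
    FROM donations
    GROUP BY donor
    ORDER BY total DESC
"""


def build_demo_db():
    """Create an in-memory database holding the hypothetical donations table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE donations (donor TEXT, amount REAL)")
    conn.executemany("INSERT INTO donations VALUES (?, ?)", ROWS)
    conn.commit()
    return conn


def donor_totals(conn):
    """Return (donor, total) pairs, largest total first."""
    cursor = conn.cursor()
    cursor.execute(TOTALS_QUERY)
    return cursor.fetchall()


if __name__ == "__main__":
    for donor, total in donor_totals(build_demo_db()):
        print(donor, total)
```

Where the dialects diverge is mostly at the edges. Limiting rows, for instance, is `LIMIT` in PostgreSQL and MySQL but `TOP` in T-SQL. Once the core is second nature, those differences are quick lookups rather than new languages.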

Likewise, if you understand Python and JavaScript, you’ll understand the underpinnings of all the great stuff built on top of them. That unlocks things like React, Selenium, Puppeteer, Apps Script, Node.js, Django, … the list goes on.

It’s Important to Actually Understand What is in Front of You

Notice that I keep using the word “understand” instead of “learn.” You aren’t just picking up some books or speeding through a course on Codecademy. You’re taking time to learn when/why to use specific tools and how best to use them. You’re becoming aware of their strengths and limitations.

If you master this approach, you’ll be dramatically less anxious about any data platform you encounter. Better yet, once you understand “all the rules,” you’ll also understand when you should break them.

New Jobs Won’t Be Stressful, They’ll Be Exciting

Because I had that kind of knowledge, the unfamiliar UIs at my new job didn’t hold me back. I was actually able to ignore them entirely and connect straight to the databases behind them. I churned out datasets in milliseconds on Day 1 that used to take a dedicated staffer an hour to complete using the systems’ clunky UIs. 
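For a rough idea of what “connect straight to the database” can look like, here is a hedged sketch rather than the code I actually ran. It uses an in-memory `sqlite3` database as a stand-in for a vendor’s backend, and the `enrollments` table, `REPORT_QUERY`, and `export_query` are hypothetical names. A real system would need its own driver, credentials, and (ideally) a read-only account.

```python
import csv
import io
import sqlite3

# Hypothetical report: number of students per grade.
REPORT_QUERY = """
    SELECT grade, COUNT(*) AS students
    FROM enrollments
    GROUP BY grade
    ORDER BY grade
"""


def build_demo_db():
    """Stand-in for a vendor database; the table and rows are invented."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE enrollments (student_id INTEGER, grade INTEGER)")
    conn.executemany(
        "INSERT INTO enrollments VALUES (?, ?)",
        [(1, 9), (2, 10), (3, 9)],
    )
    conn.commit()
    return conn


def export_query(conn, query, out_file):
    """Run a query and write its result as CSV, header row first.

    Returns the number of data rows written.
    """
    cursor = conn.cursor()
    cursor.execute(query)
    writer = csv.writer(out_file, lineterminator="\n")
    # cursor.description holds one entry per result column; item 0 is its name.
    writer.writerow([column[0] for column in cursor.description])
    row_count = 0
    for row in cursor:
        writer.writerow(row)
        row_count += 1
    return row_count


if __name__ == "__main__":
    buffer = io.StringIO()
    export_query(build_demo_db(), REPORT_QUERY, buffer)
    print(buffer.getvalue())
```

The same function works with any Python database driver that follows the DB-API cursor conventions, which is the whole point: the report logic lives in the query, not in a vendor’s screens.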

Fast forward six months and we had built custom full-stack data applications, pipelines, ELT utilities… the works. We could build such applications and integrations quickly and correctly because we knew how things worked and when/why to use them.

If all you understand is the front-end layer of your platforms, you’re a slave to that vendor. You will also keep paying for unnecessary systems or, worse, get stuck in completely inefficient and error-prone workflows.

A Caveat About Those UIs

You should allocate time to learn about those UIs even if you never use them day-to-day. However, learn them back-to-front. Start with the underlying mechanics of the software and work up to the pretty top layer. 

If you do that, you’ll grasp how all the pieces fit together, recognize where the trouble points are, and be able to adjust workflows accordingly. You’ll also be able to better understand the issues your front-end users are experiencing and be able to actually provide informed guidance.

Understand Data Mechanics, Assemble Your Tools, and Take on the World

I hope I got my point across. Assuming any of that stuck, the next time you’re perusing Medium articles or throwing money at some Udemy data magician, start looking at things a little differently. Each time you encounter a new data platform, framework, language, etc., be sure you dedicate time to answering the following questions:

  • How does it work?
  • When and why should I use it?
  • How does it interact with the other parts of the data ecosystem it operates in?
  • How could I adapt it to my specific use cases?

Those are the keys, my new data friend. Now, stop reading, start building your toolbox, and do great things.

