


Real Python: Managing Big Data with Airflow

January 27, 2023

Companies can choose from many tools to store, transfer, and transform data, but beyond a certain volume, scalability becomes a problem. The result is a big data architecture that can’t keep up with the flow of data and that costs valuable hours in maintenance.

In Episode 142 of The Real Python Podcast, Calvin Hendryx-Parker, Six Feet Up’s CTO and AWS Community Hero, discusses a recent project that used Apache Airflow, along with a host of other open source and cloud native tools, to make a statewide health system’s big data architecture faster and more manageable. Calvin also touches on the upcoming 2023 Python Web Conference, which is scheduled for March 13-17.

“We’ve historically been a Python shop since our inception 23 years ago. Our most recent demand we’ve seen is around big data pipelines,” Calvin says. “If you need to get a massive amount of data into a single spot, that’s where these big data pipelines come into play.”

Every day, hospitals need to aggregate petabytes of data to help make life-changing health and business decisions. To make it easier for the health system to handle that flow of data and to manage the system that oversees it, Six Feet Up developers devised a technique that lets the pipeline scale up quickly to meet demand. You can read about that technique in the blog post “Too Big for DAG Factories?”

“Now a data engineer doesn’t have to be a full-blown super senior Python developer to be able to import a new dataset into the data warehouse,” Calvin says.

Check out “Building a Big Data Pipeline with Cloud Native Tools” for more on the project.

In the episode, Calvin discusses:

Listen to Episode 142: “Orchestrating Large and Small Projects With Apache Airflow”

The Real Python Podcast is a weekly podcast hosted by Christopher Bailey featuring interviews, coding tips and conversation with guests from the Python community.
