Explore our expert-made templates & start with the right one for you.
Search through our Glossary
What is Amazon Kinesis? The Simple Explanation: Amazon Kinesis is an Amazon Web Service designed to process large-scale data streams from a multitude of services in
What is Apache Airflow? Airflow is an open-source workflow management system designed to programmatically author, schedule, and monitor data pipelines and workflows. The open-source distribution is
What is an Apache Airflow DAG? DAG stands for Directed Acyclic Graph. DAGs can be used to schedule and monitor Airflow tasks. A DAG is a collection of
What is Apache Kafka? Apache Kafka is an open-source streaming platform originally developed by LinkedIn. It was developed as a messaging queue but took on a
Apache Spark is a fast, flexible engine for large-scale data processing. More specifically, Apache Spark is a parallel processing framework that boosts the performance of big-data
Change Data Capture is not a new concept, and has been a part of database and data warehouse management for nearly as long as they have
A data lake is an architectural design pattern in big data. It is not a single product; rather, a data lake is a set of tools
A data pipeline is a process for moving data between source and target systems. Data pipelines are used to replicate, move, or transform data, or simply
What is a Data Warehouse? A data warehouse is a technology that aggregates data from operational systems and external data sources from anywhere within an organization,
A data pipeline is a process of moving data from one location to another, from source to target. A data ETL pipeline (extract/transform/load) is a data
What is Spark Streaming? Apache Spark Streaming is an extension of the core Apache Spark API, a distributed general-purpose cluster computing framework that natively supports both