DataFusion is an extensible query planning, optimization, and execution framework, written in Rust, that uses Apache Arrow as its in-memory format. Features: - SQL query planner with support for multiple SQL dialects - DataFrame API - Parquet, CSV, JSON, and Avro file formats are supported natively. Custom file formats can be supported by implementing a `TableProvider` trait. - Supports popular object stores, including AWS S3, Azure Blob Storage, and Google Cloud Storage. There are extension points for implementing custom object stores. Use Cases: DataFusion is modular in design with many extension points and can be used without modification as an embedded query engine and can also provide a foundation for building new systems. Here are some example use cases: - DataFusion can be used as a SQL query planner and query optimizer, providing optimized logical plans that can then be mapped to other execution engines. - DataFusion is used to create modern, fast and efficient data pipelines, ETL processes, and database systems, which need the performance of Rust and Apache Arrow and want to provide their users the convenience of an SQL interface or a DataFrame API.