Apache Parquet

uaoleg posted 6 months ago in General

Amazon RDS creates dumps in Apache Parquet format, so it would be very helpful to be able to work with this format.

ansgar posted 6 months ago

This is the first time I have heard of that format. Are there other tools that can read such files?

uaoleg posted 6 months ago

For example, JetBrains IDEs with this plugin: https://plugins.jetbrains.com/plugin/21701-big-data-file-viewer

ansgar posted 6 months ago

OK, I have the impression that Parquet is useful mainly for really big data files in general, not specifically for SQL dumps.

Don't you think the most popular approach is still to write SQL dumps to a plain text file, and perhaps zip that file?

uaoleg posted 6 months ago

It looks like AWS creates dumps only in Parquet format, so I'm looking for a tool that can handle this format on my local machine.
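
In case it helps anyone with the same problem, here is a minimal sketch of a local workaround in Python. It is only an illustration under assumptions, not something HeidiSQL or AWS provides: the conversion step assumes the third-party pyarrow package is installed, and the function names and file names are made up for this example. The first function only checks for the 4-byte "PAR1" marker that Parquet files carry at both ends; the second converts one file to CSV, which most SQL clients can import.

```python
from pathlib import Path

# Parquet files start and end with this 4-byte marker.
PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Cheap stdlib-only check for the Parquet marker at both ends of a file.

    This does not validate the file contents; it only rules out obvious
    non-Parquet files such as plain-text SQL dumps.
    """
    # Leading marker + 4-byte footer length + trailing marker = 12 bytes minimum.
    if Path(path).stat().st_size < 12:
        return False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, 2)  # jump to 4 bytes before the end of the file
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


def parquet_to_csv(parquet_path, csv_path):
    """Convert one Parquet file to CSV (hypothetical helper, needs pyarrow)."""
    # Imported lazily so the marker check above works without pyarrow.
    import pyarrow.csv as pacsv
    import pyarrow.parquet as pq

    table = pq.read_table(parquet_path)  # loads the whole file into memory
    pacsv.write_csv(table, csv_path)
    return table.num_rows
```

Usage would be something like `parquet_to_csv("export.parquet", "export.csv")` after `looks_like_parquet("export.parquet")` returns True (both file names are placeholders). Since the whole file is read into memory, very large exports would probably need to be processed in pieces instead.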

ansgar posted 6 months ago

Is that MySQL/MariaDB on AWS? There should at least be an option to create text file dumps from the command line, shouldn't there?

uaoleg posted 6 months ago

Yep, it's MySQL. And no, unfortunately it's Apache Parquet only: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ExportSnapshot.html

ansgar posted 6 months ago

OK, I see.

HeidiSQL has some support for connecting to an RDS instance.

So you could create dumps with HeidiSQL, and probably with other tools as well. Just an idea.

To be honest, having read the first paragraphs of Amazon's documentation on these snapshot files, I think this is beyond what HeidiSQL will support in the near future.

uaoleg posted 6 months ago

Thanks for the discussion!
