
How to Export Hive Table to CSV File (c)

If your Hadoop cluster allows you to connect to Hive through the command line interface (CLI), you can very easily export a Hive table to a CSV file.

It takes only a few lines of code, which I’ve written into a couple of bash/shell scripts:

Approach One (Hive Insert Overwrite a Directory):

#!/bin/bash hive -e "insert overwrite local directory '/path/in/local/' low format delimited fields terminated by ',' select * from my_database.my_table" cat /path/in/local/* > /another/path/in/local/my_table.csv"

This approach writes the contents of a Hive table to a local path (Linux) in as many files as it needs. It then uses the Linux “cat” command to merge all of the files into one CSV.
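One thing to keep in mind: insert overwrite does not write column headers, so if you want a header row you have to add it yourself during the merge. A minimal sketch, assuming hypothetical column names id, name and value (substitute your table’s actual schema):

#!/bin/bash

# Write a header row first (these column names are placeholders),
# then append the delimited part files Hive produced.
echo "id,name,value" > /another/path/in/local/my_table.csv
cat /path/in/local/* >> /another/path/in/local/my_table.csv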

Approach Two (Hive CSV Dump Internal Table):

#!/bin/bash hive -e "drop table if exists csv_dump; create table csv_dump ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n' LOCATION '/temp/storage/path' as select * from my_data_table;" hadoop fs -getmerge /temp/storage/path/ /local/path/my.csv"

This approach writes the table’s contents to an internal Hive table called csv_dump, delimited by commas and stored in HDFS as usual. It then uses a Hadoop filesystem command called “getmerge” that does the equivalent of the Linux “cat”: it merges all files in a given directory and produces a single file in another given directory (which can even be the same directory).
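Whichever script you run, a quick sanity check on the merged file is cheap. Something along these lines (using the same placeholder path as above) confirms the export actually produced rows:

#!/bin/bash

# Row count and a short preview of the merged CSV.
wc -l /local/path/my.csv
head -n 5 /local/path/my.csv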

With either approach, that .csv now lives on your local edge node and can be placed into HDFS, used in other scripts, or SCP’d to your local desktop. It’s a very efficient and easy way to get the contents of a Hive table into a format that is easily readable by both humans and applications.
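For example, those follow-up steps might look roughly like this; the HDFS target directory and the desktop hostname are made up for illustration:

#!/bin/bash

# Put the merged CSV back into HDFS (target directory is hypothetical).
hadoop fs -put /local/path/my.csv /user/me/exports/

# Or pull it down to a desktop machine over SSH (hostname is hypothetical).
scp /local/path/my.csv user@my-desktop:~/Downloads/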

(c) Internet

sdmrnv, 2018-12-02