Solving CSV Encoding Problems
Overview
A common task is to aggregate data from a database on a Linux server
and produce a CSV file as a report.
I created a CSV file on a Linux server,
attached it to an email, and sent it to Windows and Mac users,
but on both platforms the CSV file came out garbled when opened.
This post records what I tried in order to solve the problem.
Why does it get garbled in the first place?
On Windows and Mac, CSV files are generally opened by launching Excel,
but on a Japanese-language system Excel assumes Shift_JIS by default, so a file saved in a different encoding can be misread.
Some blogs suggest a workaround of opening the file in a text editor first, copying the content, and pasting it into Excel,
but when the recipient is a client, or when the file is very large,
an approach that requires extra manual steps is a no-go.
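To make the cause concrete, here is a minimal sketch that dumps the same two characters as bytes in UTF-8 and in Shift_JIS. It uses iconv and od rather than nkf, which is only an assumption about what is installed, and the variable names are made up for illustration. A program that expects one byte sequence and receives the other has no way to show the right characters.

```shell
# Hex-dump the same text in two encodings.
# Assumes this script itself is saved as UTF-8.
text='大崎'

# Bytes as stored in UTF-8 (3 bytes per kanji).
utf8_hex=$(printf '%s' "$text" | od -An -tx1 | tr -d ' \n')

# Bytes after converting to Shift_JIS (2 bytes per kanji).
sjis_hex=$(printf '%s' "$text" | iconv -f UTF-8 -t SHIFT_JIS | od -An -tx1 | tr -d ' \n')

echo "UTF-8:     $utf8_hex"
echo "Shift_JIS: $sjis_hex"
```

The two dumps have different lengths and share no common layout, which is why opening UTF-8 data as Shift_JIS produces garbage instead of merely a few wrong characters.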
Investigation 1: Convert the character encoding and then attach and send via mutt
- For character encoding, I used nkf (Network Kanji Filter) version 2.0.7 (2006-06-13)
- For sending mail, I used mutt 1.4.2.2i
- I tweaked mutt’s configuration file, but it didn’t work out.
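For reference, the five encodings tried below could be produced roughly like this. This is a sketch under assumptions, not the exact commands I ran: it swaps in iconv for nkf (nkf's -s, -j, -e and -w options cover the same conversions), works in a scratch directory, and leaves out the mutt step entirely.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample row; assumes this script is saved as UTF-8.
row='大崎,yoshi,浜田,moto,松本'

# Start from UTF-8 and derive the other encodings from it.
printf '%s\n' "$row" > utf8.csv
iconv -f UTF-8 -t SHIFT_JIS   utf8.csv > sjis.csv
iconv -f UTF-8 -t ISO-2022-JP utf8.csv > jis.csv
iconv -f UTF-8 -t EUC-JP      utf8.csv > euc.csv

# UTF-8 with BOM: prepend the three BOM bytes EF BB BF (written as octal escapes).
{ printf '\357\273\277'; cat utf8.csv; } > utf8-bom.csv

# The text is identical, but the byte counts differ per encoding.
wc -c utf8.csv utf8-bom.csv sjis.csv jis.csv euc.csv
```

Comparing byte counts is a quick sanity check that a conversion really happened before the file is attached.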
Shift_JIS
$ echo '大崎,yoshi,浜田,moto,松本' > sjis.csv
- Receive the email, download the attachment, and check the character encoding
$ nkf -g sjis.csv
Huh? I encoded it as Shift_JIS before sending, but it became UTF-8.
JIS (ISO-2022-JP)
$ echo '大崎,yoshi,浜田,moto,松本' > jis.csv
- Receive the email, download the attachment, and check the character encoding
$ nkf -g jis.csv
The file arrived with its encoding unchanged, still ISO-2022-JP, but…
it still came out garbled when opened…
UTF-8
$ echo '大崎,yoshi,浜田,moto,松本' > utf8.csv
- Receive the email, download the attachment, and check the character encoding
$ nkf -g utf8.csv
Garbled, as expected…
UTF-8 with BOM
$ echo '大崎,yoshi,浜田,moto,松本' > utf8-bom.csv
- Receive the email, download the attachment, and check the character encoding
$ nkf -g utf8-bom.csv
Same result as JIS…
EUC
$ echo '大崎,yoshi,浜田,moto,松本' > euc.csv
- Receive the email, download the attachment, and check the character encoding
$ nkf -g euc.csv
Changing the file encoding didn’t work.
Investigation 2: Try making it a BINARY file
Specifically, I tried sending the CSV as a compressed (zip) file.
Since Excel opens the CSV as Shift_JIS, I encoded it as Shift_JIS before compressing.
$ echo '大崎,yoshi,浜田,moto,松本' > sjis.csv
- Receive the email, download the attachment, and check the character encoding
$ nkf -g sjis.zip
The file was still Shift_JIS after being downloaded!
This looks promising!
It worked!
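The whole approach can be checked locally with a sketch like the following. It is not the exact procedure from this post: it assumes iconv instead of nkf, skips the mail step, and simulates the recipient by extracting into a separate directory (the `received` directory and the `roundtrip` variable are made up for illustration). If zip or unzip is not installed, it falls back to Python's zipfile command-line interface.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create the Shift_JIS CSV (assumes this script is saved as UTF-8).
printf '%s\n' '大崎,yoshi,浜田,moto,松本' | iconv -f UTF-8 -t SHIFT_JIS > sjis.csv

# Archive with zip(1) if present; otherwise fall back to Python's zipfile CLI.
if command -v zip >/dev/null 2>&1; then
  zip -q sjis.zip sjis.csv
else
  python3 -m zipfile -c sjis.zip sjis.csv
fi

# Simulate the recipient: extract into a separate directory.
mkdir received
if command -v unzip >/dev/null 2>&1; then
  unzip -q sjis.zip -d received
else
  python3 -m zipfile -e sjis.zip received
fi

# The extracted file should be byte-for-byte identical to the original.
if cmp -s sjis.csv received/sjis.csv; then
  roundtrip=identical
else
  roundtrip=changed
fi
echo "round trip: $roundtrip"
```

Because the archive is treated as opaque binary data, nothing along the way has a reason to re-encode the text inside it, which is the property the earlier plain-text attachments lacked.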
Summary
- I was able to open the CSV files sent to Windows and Mac without any garbled text.
- For reports of realistic size, compressing the file also reduces its size, making the transfer more efficient.

