I use ledger
to keep track of my expenses and for a long time I had to manually copy in all my expenses manually from the csv output that my bank provides of my transactions. Then recently I read about ledger convert
which can take a csv file and convert it to the ledger journal format.
y bank’s csv format looks like this:
Date, Type, Description, Value, Balance, Account Name, Account Number
10/08/2018,POS,"'1062 09AUG18 C , TESCO STORES ",-11.68,6355.61,"'STUDENT ACCOUNT","'260204-20408582",
10/08/2018,POS,"'1062 09AUG18 C , SAINSBURYS S/MKTS , GB",-3.80,2367.29,"'STUDENT ACCOUNT","'260204-20408582",
10/08/2018,POS,"'1062 09AUG18 C , WINES , NTMG GB",-17.00,2371.09,"'STUDENT ACCOUNT","'260204-20408582",
10/08/2018,POS,"'1062 09AUG18 , MOBILE APP ",-30.00,2388.09,"'STUDENT ACCOUNT","'260204-20408582",
Notice that there are blank lines above and below the header row, so that is one of the first things to deal with.
I can fix the file with a few sed
commands, noting that I use the macOS version of sed rather than gnused
:
sed -i "" '/^[[:space:]]*$/d' $1
sed -i "" '/\d{4}\s\d{2}\D{3}\d{2}(\sC\s)?\s?,\s/g' $1
sed -i "" '1s/.*/date,,payee,amount,,,/' $1
The first command removed any lines that are blank or contain spaces, which gets rid of the empty header lines. The second line removes some useless filler text in the notes column of the csv file, so for instance, ,"'1062 09AUG18 C , TESCO STORES ",
gets contracted to ,"TESCO STORES ",
. Finally, the header row is replaced with headers which ledger convert
recognises.
ledger convert
needs some inputs:
--input-date-format "%d/%m/%Y"
tellsledger convert
the format of thedate
column in the csv file.--account assets:bank:student_acc
denotes the bank account the csv file is for.--rich-data
extracts unnamed columns from the csv and adds them as notes to each ledger entry.-f ~/.ledger.journal
denotes the path to the ledger journal file to use for reference.--invert
inverts the sign of the transactions, which I have to do for this style of csv to stop ledger thinking that expenses are incomes and vice versa.
Then I can wrap all of this into a neat shell script which takes inputs of the input csv and output ledger journal:
#!/bin/bash
touch $2
sed -i "" '/^[[:space:]]*$/d' $1
sed -i "" '/\d{4}\s\d{2}\D{3}\d{2}(\sC\s)?\s?,\s/g' $1
sed -i "" '1s/.*/date,,payee,amount,,,/' $1
ledger convert $1 --input-date-format "%d/%m/%Y" --account assets:bank:student_acc --invert --rich-data -f ~/.ledger.journal > $2
For now, I’ll copy in the compiled journal entries manually, I don’t want to accidentally copy over my existing journal file by writing directly to it.
It’s not a perfect system, I still have to manually fill in the type of expense in the ledger journal, but I really don’t see any way around that as my expenses don’t come from a finite list of sources, so categorising all of them would be impossible.
Update 2019_08_26
I had some problems with CSV files from one of my accounts. It turns out they were being stupid with the CSV formatting and had open double quotes which weren’t closed in a description field, meaning that any commas in that field were interpreted as column delimiters. I took it as an excuse to improve the ledger convert
bash scripts I’d set up previously. I wanted to clean them up and use some shell scripting techniques I’ve learned since last year when I wrote the original script.
#!/bin/bash
# Make temp. file
temp=$(mktemp)
# Format file depending on account
if [ "$2" = "assets:bank:student" ] || [ "$2" = "assets:bank:isa" ] || [ "$2" = "assets:bank:res" ]
then
# Prepare file
## Remove blank lines | Replace column headers | Remove single quotes > send to temp
sed '/^[[:space:]]*$/d' $1 | sed '1s/.*/date,,payee,amount,,,/' | sed "s/'//" > $temp
date="%d/%m/%Y"
elif [ "$2" = "assets:bank:monzo" ]
then
echo "TEST"
# Prepare file
## Replace column headers | Change tag | Remove time from date > send to temp
sed '1s/.*/transid,date,amount,,,,payee,,,,,/' $1 | sed 's/general/misc/g' | sed 's/T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]Z//g' > $temp
date="%Y-%m-%d"
else
echo "Choose an account"
echo " assets:bank:student"
echo " assets:bank:isa"
echo " assets:bank:res"
echo " assets:bank:monzo"
fi
# Run ledger
ledger convert $temp --input-date-format ${date} --account $2 --invert --rich-data --auto-match -f ~/.ledger.journal
# Remove temp
rm ${temp}
I combined the scripts I had for my two banks into one, hinging on an if-else statement acting on the account name given as an argument to the script. It first cleans up the csv files and saves them to a $temp
file, then runs ledger convert
with certain options and sends output to STDOUT.