Choosing CSV line-endings with {readr}
Can’t believe it has been four months since the last post. I’ve been lucky enough to be quite busy with the fall-out of Covid-19, and the sketches for blogposts have just been piling up with no opportunity to dedicate time to any of them. So let’s break the silence with a quick one!
As a data scientist you are often swapping files with users on various platforms. One annoyance when delivering plain text files is the different line-endings of Windows (carriage return AND line feed, or \r\n) versus Linux and Mac (just a line feed, or \n). This is not a big deal typically, but not a whole lot of fun either.
base::write.csv() and data.table::fwrite() each have an eol argument to specify the line ending, but {readr} in the tidyverse was still lacking this option. Happily though, since March 20th the dev version finally allows it: check out this commit.
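For comparison, here is a minimal sketch of what the base R route looks like. It assumes nothing beyond base R; the helper name write_csv_eol is made up for this illustration. The connection is opened in binary mode so that Windows does not translate the line endings on its own.

```r
# Hypothetical helper (not part of any package): write a data frame as CSV
# with an explicit line ending, using only base R.
write_csv_eol <- function(x, path, eol = "\n") {
  # Binary mode ("wb") avoids the automatic "\n" -> "\r\n" translation
  # that text-mode connections perform on Windows.
  con <- file(path, open = "wb")
  on.exit(close(con))
  write.csv(x, file = con, row.names = FALSE, eol = eol)
  invisible(path)
}

crlf_file <- tempfile(fileext = ".csv")
lf_file   <- tempfile(fileext = ".csv")
write_csv_eol(head(iris), crlf_file, eol = "\r\n")
write_csv_eol(head(iris), lf_file,   eol = "\n")

# The CRLF file carries one extra byte per line (header plus six rows).
file.size(crlf_file) - file.size(lf_file)
```

With the dev version of {readr}, none of this plumbing is needed.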
For instance:
data(iris)
## For Windows
readr::write_csv(iris, path = "iris_crlf.csv", eol = "\r\n")
## For Linux / Mac
readr::write_csv(iris, path = "iris_lf.csv", eol = "\n")
We can easily check in the terminal that it worked:
cat -e iris_crlf.csv | head -n 5
## Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species^M$
## 5.1,3.5,1.4,0.2,setosa^M$
## 4.9,3,1.4,0.2,setosa^M$
## 4.7,3.2,1.3,0.2,setosa^M$
## 4.6,3.1,1.5,0.2,setosa^M$
cat -e iris_lf.csv | head -n 5
## Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species$
## 5.1,3.5,1.4,0.2,setosa$
## 4.9,3,1.4,0.2,setosa$
## 4.7,3.2,1.3,0.2,setosa$
## 4.6,3.1,1.5,0.2,setosa$
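If cat -e is not at hand (on Windows, say), the same check can be sketched from within R by looking at the raw bytes. The helper detect_eol below is a hypothetical name invented for this example, and it is deliberately naive: it counts every line feed in the file, so a quoted field containing a newline would be counted as well.

```r
# Hypothetical helper: report which line ending a file uses,
# based on the raw bytes around each line feed.
detect_eol <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  lf <- which(bytes == as.raw(0x0a))
  if (length(lf) == 0) return(NA_character_)
  # A line feed counts as CRLF when the byte before it is a carriage return.
  preceded_by_cr <- lf > 1 & bytes[pmax(lf - 1, 1)] == as.raw(0x0d)
  if (all(preceded_by_cr)) {
    "CRLF"
  } else if (!any(preceded_by_cr)) {
    "LF"
  } else {
    "mixed"
  }
}

# Two tiny files written byte for byte, so no translation can sneak in.
crlf_demo <- tempfile(fileext = ".csv")
lf_demo   <- tempfile(fileext = ".csv")
writeBin(charToRaw("a,b\r\n1,2\r\n"), crlf_demo)
writeBin(charToRaw("a,b\n1,2\n"), lf_demo)

detect_eol(crlf_demo)
detect_eol(lf_demo)
```

Pointed at the two files written earlier, detect_eol("iris_crlf.csv") and detect_eol("iris_lf.csv") should agree with what cat -e shows.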
It’s the little things that count! No more using a different CSV writer just to get the correct line endings.
To use this new feature, install the latest version using remotes::install_github("tidyverse/readr").