Rows: 3 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): status, dunno
date (1): date
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
read_csv guessed that the first column contains dates, but not the third.
The data as read in
ddd
# A tibble: 3 × 3
date status dunno
<date> <chr> <chr>
1 2011-08-03 hello August 3 2011
2 2011-11-15 still here November 15 2011
3 2012-02-01 goodbye February 1 2012
Dates in other formats
The preceding example shows that dates are best stored as text in the format yyyy-mm-dd (the ISO 8601 standard), which read_csv recognizes as dates.
To deal with dates in other formats, read them in as text and convert them with the lubridate package. For example, dates in US format with month first:
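A minimal sketch of that conversion, assuming a small tibble built by hand from the US-format strings that appear in the output in this section. The names us, us_iso and iso are made up for illustration:

```r
library(dplyr)
library(tibble)
library(lubridate)

# US-format date strings: month/day/year
us <- tibble(usdates = c("05/27/2012", "01/03/2016", "12/31/2015"))

# mdy() reads month, then day, then year, and returns a real Date column
us_iso <- us %>% mutate(iso = mdy(usdates))
us_iso
```

Parsing the same column with dmy instead, as though the day came first, gives the warning and output that follow.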
Warning: There was 1 warning in `mutate()`.
ℹ In argument: `uk = dmy(usdates)`.
Caused by warning:
! 2 failed to parse.
# A tibble: 3 × 2
usdates uk
<chr> <date>
1 05/27/2012 NA
2 01/03/2016 2016-03-01
3 12/31/2015 NA
Treated as UK-format dates, with the day first and the month second, one of these is a legit date (but the wrong one), and the other two make no sense, so they come out as NA.
Our data frame’s last column:
Back to this:
ddd
# A tibble: 3 × 3
date status dunno
<date> <chr> <chr>
1 2011-08-03 hello August 3 2011
2 2011-11-15 still here November 15 2011
3 2012-02-01 goodbye February 1 2012
The last column has month, day, year in that order, so interpret it as such:
(ddd %>% mutate(date2 = mdy(dunno)) -> d4)
# A tibble: 3 × 4
date status dunno date2
<date> <chr> <chr> <date>
1 2011-08-03 hello August 3 2011 2011-08-03
2 2011-11-15 still here November 15 2011 2011-11-15
3 2012-02-01 goodbye February 1 2012 2012-02-01
Are they really the same?
Check, row by row, that column date2 was correctly converted from column dunno:
d4 %>% mutate(equal = (date == date2))
# A tibble: 3 × 5
date status dunno date2 equal
<date> <chr> <chr> <date> <lgl>
1 2011-08-03 hello August 3 2011 2011-08-03 TRUE
2 2011-11-15 still here November 15 2011 2011-11-15 TRUE
3 2012-02-01 goodbye February 1 2012 2012-02-01 TRUE
Rows: 3 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: " "
dbl (3): year, month, day
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Our starting point
dates0
# A tibble: 3 × 3
year month day
<dbl> <dbl> <dbl>
1 1970 1 1
2 2007 9 4
3 1940 4 15
unite glues columns together, with an underscore between them unless you specify a different separator. Syntax: the first input is the name of the new column to create, and the remaining inputs are the columns to make it out of.
unite makes the original variable columns year, month, day disappear.
The column dates_text is text, while dates is a real date.
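The steps just described can be sketched as follows. This assumes the dates0 values shown above are typed in by hand rather than read from a file, and the name dates1 for the result is made up for illustration:

```r
library(dplyr)
library(tidyr)
library(tibble)
library(lubridate)

# Same values as the dates0 data frame displayed above
dates0 <- tibble(
  year  = c(1970, 2007, 1940),
  month = c(1, 9, 4),
  day   = c(1, 4, 15)
)

dates1 <- dates0 %>%
  unite(dates_text, year, month, day) %>%  # text such as "1970_1_1"; year, month, day are dropped
  mutate(dates = ymd(dates_text))          # ymd() turns that text into a real Date
dates1
```

To keep the original columns as well, unite accepts remove = FALSE.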
The time zone is “Australian Eastern Time”, either Standard or Daylight. Note when the Australian summer is.
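As a rough illustration of the Standard versus Daylight distinction, here is a sketch using two made-up date-times (not taken from the data in these notes) and the time zone name "Australia/Sydney", which is an assumption about which zone was meant:

```r
library(lubridate)

# Hypothetical instants in UTC: one in January, one in July
x <- ymd_hms(c("2016-01-10 12:00:00", "2016-07-10 12:00:00"), tz = "UTC")

# with_tz() shows the same instants on a Sydney clock
sydney <- with_tz(x, "Australia/Sydney")
sydney

# dst() reports whether daylight time is in effect:
# January is summer in Australia, so only the first one is on daylight time
dst(sydney)
```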
How long between date-times?
We may need to calculate the time between two events. For example, these are the dates and times that some patients were admitted to and discharged from a hospital:
Rows: 3 Columns: 2
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dttm (2): admit, discharge
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
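A minimal sketch of the calculation, using made-up admission and discharge date-times rather than the values in the file read above. The names stays_example, stays_out, stay and stay_days are all hypothetical:

```r
library(dplyr)
library(tibble)
library(lubridate)

# Hypothetical date-times, for illustration only
stays_example <- tibble(
  admit     = ymd_hms(c("2020-03-01 09:00:00", "2020-03-05 14:30:00")),
  discharge = ymd_hms(c("2020-03-03 09:00:00", "2020-03-06 02:30:00"))
)

stays_out <- stays_example %>%
  mutate(
    # subtracting date-times gives a difftime; R chooses the display units
    stay = discharge - admit,
    # difftime() with explicit units, converted to a plain number of days
    stay_days = as.numeric(difftime(discharge, admit, units = "days"))
  )
stays_out
```

Fixing the units explicitly avoids surprises when R chooses hours or minutes for short intervals.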
Comments
days(1), months(1), etc.
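These look like lubridate's period constructors. A small sketch of how they might be used, assuming the intent is to add a period to a date (the starting date is borrowed from the first data frame in these notes):

```r
library(lubridate)

d <- ymd("2011-08-03")

next_day   <- d + days(1)    # one calendar day later
next_month <- d + months(1)  # same day of the month, one month later

next_day
next_month
```

Adding months to a day that does not exist in the target month, such as January 31, gives NA.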