Append a string to a value based on a column condition

#r #string #character #paste

Question:

I have a data frame that looks like this:

 df <- tibble::tribble(
  ~subcateg, ~names, ~categ, ~names2,
  "A00", "Kidney failure", "A00", "Kidney failure",
  "A001", "Kidney failure reason1", "A00", "Kidney failure",
  "A002", "Kidney failure reason2","A00", "Kidney failure",
  "A003", "Kidney failure reason3","A00", "Kidney failure",
  "B00", "Heart failure", "B00", "Heart failure",
  "B001", "Heart failure reason1",  "B00", "Heart failure",
  "B002", "Heart failure reason2",  "B00", "Heart failure",
  "B003", "Heart failure reason3",  "B00", "Heart failure",
  "C00", "Lung failure", "C00", "Lung failure",
  "C001", "Lung failure reason1",  "C00", "Lung failure",
  "C002", "Lung failure reason2",  "C00", "Lung failure",
  "C003", "Lung failure reason3",  "C00", "Lung failure",
)
 

I need to append "X" to the values in the first column (subcateg) that are only 3 characters long. These are the rows where subcateg equals categ, so the result should look like this:

 df <- tibble::tribble(
  ~subcateg, ~names, ~categ, ~names2,
  "A00X", "Kidney failure", "A00", "Kidney failure",
  "A001", "Kidney failure reason1", "A00", "Kidney failure",
  "A002", "Kidney failure reason2","A00", "Kidney failure",
  "A003", "Kidney failure reason3","A00", "Kidney failure",
  "B00X", "Heart failure", "B00", "Heart failure",
  "B001", "Heart failure reason1",  "B00", "Heart failure",
  "B002", "Heart failure reason2",  "B00", "Heart failure",
  "B003", "Heart failure reason3",  "B00", "Heart failure",
  "C00X", "Lung failure", "C00", "Lung failure",
  "C001", "Lung failure reason1",  "C00", "Lung failure",
  "C002", "Lung failure reason2",  "C00", "Lung failure",
  "C003", "Lung failure reason3",  "C00", "Lung failure",
)
 

I tried something like this:

 df %>%
  filter(subcateg = categ)  %>%
  paste0(df$subcateg, "X")
 

but it doesn't work.
Any ideas?

Thanks!

Answer #1:

We could use nchar inside an ifelse statement:

 
library(dplyr)
df %>% 
  mutate(subcateg= ifelse(nchar(subcateg)==3, paste0(subcateg, "X"), subcateg))
 
    subcateg names                  categ names2        
   <chr>    <chr>                  <chr> <chr>         
 1 A00X     Kidney failure         A00   Kidney failure
 2 A001     Kidney failure reason1 A00   Kidney failure
 3 A002     Kidney failure reason2 A00   Kidney failure
 4 A003     Kidney failure reason3 A00   Kidney failure
 5 B00X     Heart failure          B00   Heart failure 
 6 B001     Heart failure reason1  B00   Heart failure 
 7 B002     Heart failure reason2  B00   Heart failure 
 8 B003     Heart failure reason3  B00   Heart failure 
 9 C00X     Lung failure           C00   Lung failure  
10 C001     Lung failure reason1   C00   Lung failure  
11 C002     Lung failure reason2   C00   Lung failure  
12 C003     Lung failure reason3   C00   Lung failure 
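
As a side note on the attempt in the question: `filter()` needs the comparison operator `==` rather than `=`, and it would drop the non-matching rows instead of modifying them in place. A minimal sketch of the same idea, assuming the condition the question describes (subcateg equal to categ) is the one you want to test; the shortened sample data and the name `result` are only for illustration:

```r
library(dplyr)

# Shortened sample in the same shape as the question's data (illustrative only)
df <- tibble::tribble(
  ~subcateg, ~names, ~categ, ~names2,
  "A00",  "Kidney failure",         "A00", "Kidney failure",
  "A001", "Kidney failure reason1", "A00", "Kidney failure",
  "B00",  "Heart failure",          "B00", "Heart failure",
  "B001", "Heart failure reason1",  "B00", "Heart failure"
)

# Append "X" only in rows where subcateg equals categ; all rows are kept
result <- df %>%
  mutate(subcateg = if_else(subcateg == categ, paste0(subcateg, "X"), subcateg))

result$subcateg
```

`if_else()` is dplyr's stricter counterpart to `ifelse()`: it requires both branches to have the same type, which holds here because both are character vectors.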
 

Answer #2:

We can use case_when with a logical condition that checks whether nchar of 'subcateg' is 3, and then paste (str_c) an "X" onto the end of 'subcateg':

 library(dplyr)
library(stringr)
df %>% 
   mutate(subcateg = case_when(nchar(subcateg) == 3 ~ 
         str_c(subcateg, 'X'), TRUE ~ subcateg))
 

-output

 # A tibble: 12 × 4
   subcateg names                  categ names2        
   <chr>    <chr>                  <chr> <chr>         
 1 A00X     Kidney failure         A00   Kidney failure
 2 A001     Kidney failure reason1 A00   Kidney failure
 3 A002     Kidney failure reason2 A00   Kidney failure
 4 A003     Kidney failure reason3 A00   Kidney failure
 5 B00X     Heart failure          B00   Heart failure 
 6 B001     Heart failure reason1  B00   Heart failure 
 7 B002     Heart failure reason2  B00   Heart failure 
 8 B003     Heart failure reason3  B00   Heart failure 
 9 C00X     Lung failure           C00   Lung failure  
10 C001     Lung failure reason1   C00   Lung failure  
11 C002     Lung failure reason2   C00   Lung failure  
12 C003     Lung failure reason3   C00   Lung failure
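
If you would rather not load dplyr or stringr, the same length check can be sketched in base R with logical indexing. This is only an illustration on a shortened sample; the helper name `short` is hypothetical, and the same two lines should also work on the tibble from the question:

```r
# Shortened sample with just the columns needed for the example
df <- data.frame(
  subcateg = c("A00", "A001", "B00", "B001"),
  categ    = c("A00", "A00", "B00", "B00"),
  stringsAsFactors = FALSE
)

# Logical vector marking the rows whose code is exactly 3 characters long
short <- nchar(df$subcateg) == 3

# Overwrite only those rows, appending "X"
df$subcateg[short] <- paste0(df$subcateg[short], "X")

df$subcateg
```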