I want to extract myproducta and myproductb.
I thought a regular expression would do it, but it only works for the cc string, not for aa. Why is that? Both have the same length.
aa <- "e220juju_uk_yy_aon_aon_conversion_mystore_facebook-network_ppl_primaria_myproducta_galaxycombos_20220520"
cc <- "e220tyty_bo_oo_aon_aon_conversion_mystore_facebook-network_ppl_lal_myproductb_wd95m4473mw_diasdecyber_20220718"
The regex part:
gsub(cc, pattern = ".*_.*_.*_.*_.*_.*_.*_.*_.*_(.*)_.*_.*_.*", replacement = "\\1", perl = TRUE) # works: returns myproductb
gsub(aa, pattern = ".*_.*_.*_.*_.*_.*_.*_.*_.*_(.*)_.*_.*_.*", replacement = "\\1", perl = TRUE) # doesn't work: returns primaria
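The original pattern fails because .* is greedy and can also match underscores, so the capture group is not tied to a fixed field counted from the start of the string. Instead, you can anchor the pattern and use a negated character class: match a run of non-underscore characters followed by an underscore exactly 10 times, then capture the 11th field.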
^(?:[^_]*_){10}([^_]*).*$
aa <- "e220juju_uk_yy_aon_aon_conversion_mystore_facebook-network_ppl_primaria_myproducta_galaxycombos_20220520"
cc <- "e220tyty_bo_oo_aon_aon_conversion_mystore_facebook-network_ppl_lal_myproductb_wd95m4473mw_diasdecyber_20220718"
pattern <- "^(?:[^_]*_){10}([^_]*).*$"
gsub(pattern, "\\1", aa, perl = TRUE)
gsub(pattern, "\\1", cc, perl = TRUE)
Output:
[1] "myproducta"
[1] "myproductb"
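To see why the original pattern behaves differently on the two strings, it can help to count the underscore-separated fields. The sketch below uses only base R; the variable names `fields`, `from_end`, and `from_start` are illustrative and not part of the question. It assumes the product name is always the 11th field, which holds for these two examples but may not hold for other strings.

```r
aa <- "e220juju_uk_yy_aon_aon_conversion_mystore_facebook-network_ppl_primaria_myproducta_galaxycombos_20220520"
cc <- "e220tyty_bo_oo_aon_aon_conversion_mystore_facebook-network_ppl_lal_myproductb_wd95m4473mw_diasdecyber_20220718"

# Split each string on literal underscores; names of the input vector are kept.
fields <- strsplit(c(aa = aa, cc = cc), "_", fixed = TRUE)

# The two strings do not have the same number of fields.
lengths(fields)
# aa cc
# 13 14

# The greedy pattern leaves exactly three fields after the capture group,
# so it picks the 4th field from the end.
from_end <- sapply(fields, function(f) f[length(f) - 3])
from_end
#           aa           cc
#   "primaria" "myproductb"

# Counting from the start instead gives the wanted field in both cases
# (assumption: the product name is always field 11).
from_start <- sapply(fields, `[`, 11)
from_start
#           aa           cc
# "myproducta" "myproductb"
```

If the position of the product name can vary between strings, neither a fixed index nor the anchored regex will be enough, and some other marker in the string would be needed.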
Here are a few approaches: