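Data processing often calls for combining several data sets at once, either by column (a join on a key) or by row (stacking). merge() joins only two data frames per call, and rbind() needs every object written out as a separate argument, so both become tedious when the data frames are held in a list. A small custom function built on Reduce(), or a call to do.call(), can join or stack the whole list in one step, as follows: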

Merging by column

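library(foreign)                                               # foreign provides read.xport()

mypath <- "C:/Users/18896/Desktop/example1"

multmerge <- function(mypath) {
  # List every .XPT file in the folder (pattern is a regular expression)
  filenames <- list.files(path = mypath, pattern = "\\.XPT$", full.names = TRUE)
  # Read each file into a data frame
  datalist <- lapply(filenames, read.xport)
  # Full outer join on SEQN, folded over the whole list
  Reduce(function(x, y) merge(x, y, by = "SEQN", all = TRUE), datalist)
}

mergedata <- multmerge(mypath)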

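Here mypath is the local folder that holds all the files to be merged. The function multmerge() lists the .XPT files in that folder, reads each one so that the result is a list of data frames, and then joins the data frames in the list with merge() on the shared key SEQN. Because the join is a full outer join, rows that appear in only some of the files are kept and padded with NA. The result, mergedata, is the merged data frame.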
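
The same list, read, and fold pattern can be tried without any .XPT files at hand. The sketch below is only an illustration: it writes three small CSV files to a temporary folder and swaps read.csv() in for read.xport(). The folder name, file names, and the columns age, bmi, and smoker are all made up for the demo.

```r
# Hypothetical demo data: three small CSV files stand in for the .XPT files.
demo_dir <- file.path(tempdir(), "merge_demo")
dir.create(demo_dir, showWarnings = FALSE)

write.csv(data.frame(SEQN = 1:3, age = c(30, 41, 58)),
          file.path(demo_dir, "demo.csv"), row.names = FALSE)
write.csv(data.frame(SEQN = 2:4, bmi = c(22.5, 27.1, 31.0)),
          file.path(demo_dir, "body.csv"), row.names = FALSE)
write.csv(data.frame(SEQN = c(1, 4), smoker = c("no", "yes")),
          file.path(demo_dir, "smoke.csv"), row.names = FALSE)

# Same three steps as multmerge(): list the files, read them, fold with merge()
filenames <- list.files(path = demo_dir, pattern = "\\.csv$", full.names = TRUE)
datalist  <- lapply(filenames, read.csv)
merged    <- Reduce(function(x, y) merge(x, y, by = "SEQN", all = TRUE), datalist)

print(merged)
```

Every SEQN value that occurs in at least one file appears once in merged, with NA wherever a file had no row for it.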

When the data are not read from a local folder (data frames already in memory)

data1 <- data.frame(id = 1:6,                                  # Create first example data frame
                    x1 = c(5, 1, 4, 9, 1, 2),
                    x2 = c("A", "Y", "G", "F", "G", "Y"))
 
data2 <- data.frame(id = 4:9,                                  # Create second example data frame
                    y1 = c(3, 3, 4, 1, 2, 9),
                    y2 = c("a", "x", "a", "x", "a", "x"))
 
data3 <- data.frame(id = 5:6,                                  # Create third example data frame
                    z1 = c(3, 2),
                    z2 = c("K", "b"))

data_list <- list(data1, data2, data3)

my_merge <- function(df1, df2){                                # Create own merging function
  merge(df1, df2, by = "id")
}

Reduce(my_merge, data_list) 
 
#   id x1 x2 y1 y2 z1 z2
# 1  5  1  G  3  x  3  K
# 2  6  2  Y  4  a  2  b
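
Only ids 5 and 6 survive above because merge() performs an inner join by default, keeping only keys present in every data frame. As a small sketch of the difference (using trimmed-down copies of the example data, with the character columns left out for brevity), passing all = TRUE turns each step into a full outer join:

```r
# Trimmed-down copies of the example data frames (numeric columns only)
data1 <- data.frame(id = 1:6, x1 = c(5, 1, 4, 9, 1, 2))
data2 <- data.frame(id = 4:9, y1 = c(3, 3, 4, 1, 2, 9))
data3 <- data.frame(id = 5:6, z1 = c(3, 2))
data_list <- list(data1, data2, data3)

# Inner join: keep only ids found in all three data frames
inner <- Reduce(function(a, b) merge(a, b, by = "id"), data_list)

# Full outer join: keep every id, filling the gaps with NA
outer <- Reduce(function(a, b) merge(a, b, by = "id", all = TRUE), data_list)

nrow(inner)  # 2
nrow(outer)  # 9
```

Which variant is appropriate depends on whether rows missing from some sources should be dropped or kept with missing values.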

Alternatively, use the tidyverse package

install.packages("tidyverse")                                  # Install tidyverse package
library("tidyverse")
data_list %>% reduce(inner_join, by = "id")                    # Apply reduce function of tidyverse

#   id x1 x2 y1 y2 z1 z2
# 1  5  1  G  3  x  3  K
# 2  6  2  Y  4  a  2  b

Merging by row

library(data.table)
DT1 = data.table(A=1:3,B=letters[1:3])
DT2 = data.table(B=letters[4:5],A=4:5)
DT3=data.table(A=6:7,B=letters[6:7])
l = list(DT1,DT2,DT3)
rbindlist(l, use.names=TRUE)
#   A B
#1: 1 a
#2: 2 b
#3: 3 c
#4: 4 d
#5: 5 e
#6: 6 f
#7: 7 g
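
If data.table is not available, base R can stack a list of ordinary data frames with do.call(). The following is a minimal sketch with made-up data frames df1 to df3 that mirror DT1 to DT3. It relies on the data frame method of rbind() matching columns by name, so the swapped column order in df2 is handled.

```r
# Plain data frames mirroring DT1, DT2 and DT3 (df2 has its columns swapped)
df1 <- data.frame(A = 1:3, B = letters[1:3])
df2 <- data.frame(B = letters[4:5], A = 4:5)
df3 <- data.frame(A = 6:7, B = letters[6:7])
df_list <- list(df1, df2, df3)

# do.call() passes every element of the list to rbind() as a separate argument
stacked <- do.call(rbind, df_list)

print(stacked)
```

For long lists or large tables, rbindlist() is generally the faster choice. The base R version is mainly convenient when no extra package should be loaded.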

Row-binding the same data frame several times

do.call("rbind", replicate(4, DT1, simplify = FALSE))

#    A B
# 1: 1 a
# 2: 2 b
# 3: 3 c
# 4: 1 a
# 5: 2 b
# 6: 3 c
# 7: 1 a
# 8: 2 b
# 9: 3 c
#10: 1 a
#11: 2 b
#12: 3 c

Reference: R Merge Multiple Data Frames in List (2 Examples) | Base R vs. tidyverse (statisticsglobe.com)
