data analysis & visualization

First, open Chrome and go to Naver DataLab!

library(RSelenium)
# remDr is assumed to be an RSelenium remoteDriver object created beforehand
remDr$open()
url='https://datalab.naver.com/keyword/realtimeList.naver?where=main'
remDr$navigate(url)

Fetch the page source, then convert it to text.


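library(rvest)  # read_html, html_nodes, html_text and the %>% pipe
html <- remDr$getPageSource()[[1]]
html <- read_html(html)
# one text string per div.rank_inner block
sWords <- html %>% html_nodes("div.rank_inner") %>% html_text()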
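The parsing step can be tried without a browser. Below is a minimal sketch that runs the same rvest calls on a small hand-written HTML string. The markup in sample_html is a made-up stand-in for the real page source, so the actual structure of the Naver page may differ.

```r
library(rvest)  # also provides the %>% pipe

# Hypothetical stand-in for the string returned by remDr$getPageSource()[[1]]
sample_html <- '
<html><body>
  <div class="rank_inner"><strong>Block A</strong>
    <ul><li>1 apple</li><li>2 pear</li></ul></div>
  <div class="rank_inner"><strong>Block B</strong>
    <ul><li>1 kiwi</li><li>2 plum</li></ul></div>
</body></html>'

page   <- read_html(sample_html)
# Select every <div class="rank_inner"> and keep only its text content
blocks <- page %>% html_nodes("div.rank_inner") %>% html_text()

length(blocks)  # one character string per matched <div>
```

Each element of blocks holds the whole text of one block, which is why the next step has to split it apart again.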

Split each block into lines with str_split.
library(stringr)
sWords=str_split(sWords,'\n')

n=length(sWords)

Clean up the words of the first block (remove spaces, then drop empty strings).
sWords2=gsub(' ','',sWords[[1]])
sWords2=sWords2[nchar(sWords2)!=0]

Extract the title.
title=sWords2[1]
sWords2=sWords2[-1]
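To see what the split, clean-up, and title steps do to a single block, here is a small sketch on a made-up string. The contents of raw are hypothetical. The real text returned for a block may have a different layout.

```r
library(stringr)

# Hypothetical raw text of one block: a title line, then rank / keyword lines
raw <- "\n  Age 20s \n 1 \n keyword one \n 2 \n keyword two \n"

tokens <- str_split(raw, "\n")[[1]]   # str_split returns a list; take element 1
tokens <- gsub(" ", "", tokens)       # remove every space
tokens <- tokens[nchar(tokens) != 0]  # drop the empty strings left behind

title  <- tokens[1]                   # first remaining token is the block title
tokens <- tokens[-1]                  # the rest alternates rank, keyword

title   # "Age20s"
tokens  # "1" "keywordone" "2" "keywordtwo"
```

Note that gsub(' ','',...) removes spaces inside keywords as well, so multi-word keywords end up joined together.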

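Build a data frame and store it in df.

(Pass stringsAsFactors=F so that rank stays a character vector. In R versions before 4.0 the default turns it into a factor, and as.numeric() on a factor returns level codes instead of the ranks.)

df=data.frame(rank=sWords2[seq(1,length(sWords2),2)],sWords2[seq(2,length(sWords2),2)],stringsAsFactors=F)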
colnames(df)=c('rank',title)
data=df
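The odd/even indexing above is easier to follow on a tiny input. A minimal sketch, assuming the cleaned tokens strictly alternate rank, keyword. The token values and the title are made up.

```r
# Hypothetical cleaned tokens for one block: rank, keyword, rank, keyword, ...
tokens <- c("1", "alpha", "2", "beta", "3", "gamma")
title  <- "Age20s"  # hypothetical block title

odd  <- seq(1, length(tokens), by = 2)  # positions of the ranks
even <- seq(2, length(tokens), by = 2)  # positions of the keywords

df <- data.frame(rank = tokens[odd], keyword = tokens[even],
                 stringsAsFactors = FALSE)
colnames(df) <- c("rank", title)  # name the keyword column after the block

df
```

If a block ever had an odd number of tokens, the two index vectors would differ in length and data.frame() would stop with an error, so this layout assumption is worth checking on real data.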

Repeat the same steps for the remaining blocks and merge each result by rank.


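# seq_len(n)[-1] is 2..n, and is empty when n is 1 (2:n would count backwards)
for(i in seq_len(n)[-1]){
  sWords2=gsub(' ','',sWords[[i]])
  sWords2=sWords2[nchar(sWords2)!=0]
  title=sWords2[1]
  sWords2=sWords2[-1]
  df=data.frame(rank=sWords2[seq(1,length(sWords2),2)],sWords2[seq(2,length(sWords2),2)],stringsAsFactors=F)
  colnames(df)=c('rank',title)
  # join the new block onto the accumulated table by rank
  data=merge(data,df,by='rank')
}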
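The loop merges one data frame at a time. The same idea can be written as a fold with Reduce. This is only an alternative sketch, not what the code above does. make_df and the sample keywords are hypothetical helpers for illustration.

```r
# Hypothetical helper: one block's data frame with a character rank column
make_df <- function(title, keywords) {
  df <- data.frame(rank = as.character(seq_along(keywords)), keywords,
                   stringsAsFactors = FALSE)
  colnames(df) <- c("rank", title)
  df
}

dfs <- list(
  make_df("A", c("a1", "a2", "a3")),
  make_df("B", c("b1", "b2", "b3")),
  make_df("C", c("c1", "c2", "c3"))
)

# Fold merge() over the list: ((A merge B) merge C), always joining on rank
merged <- Reduce(function(x, y) merge(x, y, by = "rank"), dfs)

merged
```

merge() keeps only ranks that appear in both inputs by default, so a block with a missing rank would silently drop that row from the combined table.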

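# as.character first, so the conversion is also correct if rank is a factor
data$rank=as.numeric(as.character(data$rank))
# order() only returns row indices, so assign the sorted result back
data=data[order(data$rank),]
data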
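Why the final conversion and sort matter can be shown on a toy table. A minimal sketch with made-up data: character ranks sort as text, and a factor converted directly with as.numeric() gives level codes rather than the original numbers.

```r
# Hypothetical table with ranks stored as text
ranks <- as.character(1:12)
data  <- data.frame(rank = ranks, word = paste0("w", ranks),
                    stringsAsFactors = FALSE)

# Sorting text puts "10" before "2", which is also how merge() orders its keys
data <- data[order(data$rank), ]

# Convert to numbers, then sort and assign the result back
data$rank <- as.numeric(data$rank)
data <- data[order(data$rank), ]

# Factor pitfall: as.numeric() on a factor returns the level codes
f <- factor(c("1", "10", "2"))
codes  <- as.numeric(f)                # 1 2 3
values <- as.numeric(as.character(f))  # 1 10 2
```

Going through as.character() first is the safe conversion whenever a column might be a factor.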


remDr$close()

 

Full code

library(RSelenium)
library(rvest)
library(stringr)

# remDr is assumed to be an RSelenium remoteDriver object created beforehand
remDr$open()
url='https://datalab.naver.com/keyword/realtimeList.naver?where=main'
remDr$navigate(url)

html <- remDr$getPageSource()[[1]]
html <- read_html(html)
sWords <- html %>% html_nodes("div.rank_inner") %>% html_text()
sWords=str_split(sWords,'\n')

n=length(sWords)

# first block
sWords2=gsub(' ','',sWords[[1]])
sWords2=sWords2[nchar(sWords2)!=0]
title=sWords2[1]
sWords2=sWords2[-1]
df=data.frame(rank=sWords2[seq(1,length(sWords2),2)],sWords2[seq(2,length(sWords2),2)],stringsAsFactors=F)
colnames(df)=c('rank',title)
data=df

# remaining blocks, merged by rank
for(i in seq_len(n)[-1]){
  sWords2=gsub(' ','',sWords[[i]])
  sWords2=sWords2[nchar(sWords2)!=0]
  title=sWords2[1]
  sWords2=sWords2[-1]
  df=data.frame(rank=sWords2[seq(1,length(sWords2),2)],sWords2[seq(2,length(sWords2),2)],stringsAsFactors=F)
  colnames(df)=c('rank',title)
  data=merge(data,df,by='rank')
}
data$rank=as.numeric(as.character(data$rank))
data=data[order(data$rank),]
data

remDr$close()
