Crawling Naver real-time search keywords
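First, open Chrome and navigate to Naver DataLab. (remDr is assumed to be an RSelenium remoteDriver that is already connected to a running Selenium server.)
library(RSelenium)
library(rvest)
library(stringr)
remDr$open()
url='https://datalab.naver.com/keyword/realtimeList.naver?where=main'
remDr$navigate(url)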
Fetch the page source, parse it, and extract the text of each ranking block
html <- remDr$getPageSource()[[1]]
html <- read_html(html)
sWords <- html %>% html_nodes("div.rank_inner") %>% html_text()
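To see what the selector step returns without a live browser, here is a minimal sketch that runs the same rvest calls on a small hand-written HTML string. The markup is invented for illustration and is only assumed to resemble the structure of the real page.

```r
library(rvest)

# Hypothetical markup: two ranking blocks, each a div.rank_inner
sample_html <- '<html><body>
<div class="rank_inner"><strong>Block A</strong><span>1</span><span>apple</span></div>
<div class="rank_inner"><strong>Block B</strong><span>1</span><span>banana</span></div>
<div class="other">ignored</div>
</body></html>'

page   <- read_html(sample_html)
blocks <- page %>% html_nodes("div.rank_inner") %>% html_text()

length(blocks)  # one string per matching div; div.other is not selected
```

Each element of the result holds all the text of one block in a single string, which is why the next step has to split it.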
Split each block's text on newlines with str_split; the result is a list with one character vector per block
sWords=str_split(sWords,'\n')
n=length(sWords)
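The shape of the str_split result is easier to follow on a toy input. The sketch below uses made-up strings that are only assumed to look like the text of two ranking blocks.

```r
library(stringr)

# Hypothetical text of two blocks, roughly as html_text() might return it
blocks <- c("Title A\n 1\n apple \n", "Title B\n 1\n banana \n")

parts <- str_split(blocks, "\n")

length(parts)  # one list element per input string
parts[[1]]     # pieces of the first block, still containing spaces and empty strings
```

Because the pieces still carry stray spaces and empty strings, a cleansing step is needed before they can be paired up.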
Cleanse the words: strip the spaces and drop the empty strings (first block shown here)
sWords2=gsub(' ','',sWords[[1]])
sWords2=sWords2[nchar(sWords2)!=0]
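One side effect of the gsub call is that it deletes every space, including spaces inside a multi-word keyword. The sketch below compares it with trimws, which trims only the ends; the input vector is hypothetical, and the trimws variant is an alternative rather than the approach used in this post.

```r
# Hypothetical pieces of one block after splitting on "\n"
pieces <- c("  Title A ", "", " 1", "  new phone ", "", " 2", " weather ")

# Approach from the text: delete every space, then drop empty strings
no_space <- gsub(" ", "", pieces)
no_space <- no_space[nchar(no_space) != 0]

# Alternative (an assumption, not the author's method): trim only the ends
trimmed <- trimws(pieces)
trimmed <- trimmed[nchar(trimmed) != 0]

no_space  # "new phone" has become "newphone"
trimmed   # inner spaces are preserved
```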
Extract the title: the first element is the block's title, and the rest are rank/keyword pairs
title=sWords2[1]
sWords2=sWords2[-1]
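Create the data frame df
(Pass stringsAsFactors=F so that rank is stored as character, not factor; otherwise as.numeric() at the end would return factor level codes instead of the actual ranks. Since R 4.0.0 this is the default.)
df=data.frame(rank=sWords2[seq(1,length(sWords2),2)],sWords2[seq(2,length(sWords2),2)],stringsAsFactors=F)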
colnames(df)=c('rank',title)
data=df
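The odd/even indexing is what turns the flat vector into two columns. Here is a small sketch on a hypothetical cleaned vector, assuming (as the code above does) that ranks and keywords strictly alternate after the title.

```r
# Hypothetical cleaned vector: title first, then rank/keyword pairs
cleaned <- c("TitleA", "1", "apple", "2", "banana", "3", "cherry")

title <- cleaned[1]
body  <- cleaned[-1]

odd  <- seq(1, length(body), 2)  # positions 1, 3, 5 hold the ranks
even <- seq(2, length(body), 2)  # positions 2, 4, 6 hold the keywords

df <- data.frame(rank = body[odd], keyword = body[even], stringsAsFactors = FALSE)
colnames(df) <- c("rank", title)
df
```

If a block ever contained an odd number of elements after the title, the two columns would have different lengths and data.frame would stop with an error, so the alternation assumption is worth checking on real data.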
Repeat for the remaining blocks, merging each one into data by rank
for(i in seq_len(n)[-1]){  # blocks 2..n; safe even when n is 1
  sWords2=gsub(' ','',sWords[[i]])
  sWords2=sWords2[nchar(sWords2)!=0]
  title=sWords2[1]
  sWords2=sWords2[-1]
  df=data.frame(rank=sWords2[seq(1,length(sWords2),2)],sWords2[seq(2,length(sWords2),2)],stringsAsFactors=F)
  colnames(df)=c('rank',title)
  data=merge(data,df,by='rank')
}
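Since the first block and the loop body run the same steps, they could be folded into one helper and combined with Reduce. This is only a sketch of that refactoring: parse_block and sample_blocks are hypothetical names, and the input stands in for the sWords list.

```r
# Hypothetical helper: turn one split block into a rank/keyword data frame
parse_block <- function(lines) {
  x <- gsub(" ", "", lines)
  x <- x[nchar(x) != 0]
  title <- x[1]
  x <- x[-1]
  df <- data.frame(rank    = x[seq(1, length(x), 2)],
                   keyword = x[seq(2, length(x), 2)],
                   stringsAsFactors = FALSE)
  colnames(df) <- c("rank", title)
  df
}

# Hypothetical input standing in for the sWords list
sample_blocks <- list(
  c("Title A", " 1", " apple", " 2", " banana"),
  c("Title B", " 1", " cat",   " 2", " dog")
)

frames   <- lapply(sample_blocks, parse_block)
combined <- Reduce(function(a, b) merge(a, b, by = "rank"), frames)
combined
```

This version needs no special case for the first block, and it also works when there is only one block.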
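Convert rank to numeric and sort by it (merge sorts rank as text, so "10" would otherwise come before "2")
data$rank=as.numeric(data$rank)
data=data[order(data$rank),]
data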
remDr$close()
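The final conversion and sort matter because merge orders its key column as text. A minimal sketch with two made-up data frames shows the effect and why the result of order() has to be assigned back.

```r
# Hypothetical per-block data frames with character ranks
ranks <- as.character(1:10)
a <- data.frame(rank = ranks, A = paste0("a", 1:10), stringsAsFactors = FALSE)
b <- data.frame(rank = ranks, B = paste0("b", 1:10), stringsAsFactors = FALSE)

merged <- merge(a, b, by = "rank")
merged$rank  # sorted as text, so "10" comes before "2"

merged$rank <- as.numeric(merged$rank)
merged <- merged[order(merged$rank), ]  # without the assignment the sort is discarded
merged$rank
```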
Full code
library(RSelenium)
library(rvest)
library(stringr)
remDr$open()
url='https://datalab.naver.com/keyword/realtimeList.naver?where=main'
remDr$navigate(url)
html <- remDr$getPageSource()[[1]]
html <- read_html(html)
sWords <- html %>% html_nodes("div.rank_inner") %>% html_text()
sWords=str_split(sWords,'\n')
n=length(sWords)
sWords2=gsub(' ','',sWords[[1]])
sWords2=sWords2[nchar(sWords2)!=0]
title=sWords2[1]
sWords2=sWords2[-1]
df=data.frame(rank=sWords2[seq(1,length(sWords2),2)],sWords2[seq(2,length(sWords2),2)],stringsAsFactors=F)
colnames(df)=c('rank',title)
data=df
for(i in seq_len(n)[-1]){
  sWords2=gsub(' ','',sWords[[i]])
  sWords2=sWords2[nchar(sWords2)!=0]
  title=sWords2[1]
  sWords2=sWords2[-1]
  df=data.frame(rank=sWords2[seq(1,length(sWords2),2)],sWords2[seq(2,length(sWords2),2)],stringsAsFactors=F)
  colnames(df)=c('rank',title)
  data=merge(data,df,by='rank')
}
data$rank=as.numeric(data$rank)
data=data[order(data$rank),]
data
remDr$close()