Data Input and Output

Working with data provided by R packages is a great way to learn the tools of data science, but at some point you want to stop learning and start working with your own data.

Hadley Wickham

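Having covered user-defined functions and functional programming, we can finally breathe a sigh of relief! The fundamentals are behind us, and from here things should fall into place naturally. So far we have learned by building simple data structures by hand. In practice, however, R is mostly used to read in a dataset, clean or analyze it, and then write the result out as a processed file.

Built-in Data

R comes with a rich collection of built-in datasets. Most are stored as data frames (data.frame) and can be used as soon as R starts. To see which datasets are available, call the data() function; RStudio then displays the list of built-in datasets in the Source pane.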

data()
## Data sets in package ‘datasets’:
## 
## AirPassengers           Monthly Airline Passenger Numbers 1949-1960
## BJsales                 Sales Data with Leading Indicator
## BJsales.lead (BJsales)
##                         Sales Data with Leading Indicator
## BOD                     Biochemical Oxygen Demand
## CO2                     Carbon Dioxide Uptake in Grass Plants
## ChickWeight             Weight versus age of chicks on different diets
## DNase                   Elisa assay of DNase
## EuStockMarkets          Daily Closing Prices of Major European Stock
##                         Indices, 1991-1998
## Formaldehyde            Determination of Formaldehyde
## HairEyeColor            Hair and Eye Color of Statistics Students
## Harman23.cor            Harman Example 2.3
## Harman74.cor            Harman Example 7.4
## Indometh                Pharmacokinetics of Indomethacin
## InsectSprays            Effectiveness of Insect Sprays
## JohnsonJohnson          Quarterly Earnings per Johnson & Johnson Share
## LakeHuron               Level of Lake Huron 1875-1972
## LifeCycleSavings        Intercountry Life-Cycle Savings Data
## Loblolly                Growth of Loblolly pine trees
## Nile                    Flow of the River Nile
## Orange                  Growth of Orange Trees
## OrchardSprays           Potency of Orchard Sprays
## PlantGrowth             Results from an Experiment on Plant Growth
## Puromycin               Reaction Velocity of an Enzymatic Reaction
## Seatbelts               Road Casualties in Great Britain 1969-84
## Theoph                  Pharmacokinetics of Theophylline
## Titanic                 Survival of passengers on the Titanic
## ToothGrowth             The Effect of Vitamin C on Tooth Growth in
##                         Guinea Pigs
## UCBAdmissions           Student Admissions at UC Berkeley
## UKDriverDeaths          Road Casualties in Great Britain 1969-84
## UKgas                   UK Quarterly Gas Consumption
## USAccDeaths             Accidental Deaths in the US 1973-1978
## USArrests               Violent Crime Rates by US State
## USJudgeRatings          Lawyers' Ratings of State Judges in the US
##                         Superior Court
## USPersonalExpenditure   Personal Expenditure Data
## UScitiesD               Distances Between European Cities and Between
##                         US Cities
## VADeaths                Death Rates in Virginia (1940)
## WWWusage                Internet Usage per Minute
## WorldPhones             The World's Telephones
## ability.cov             Ability and Intelligence Tests
## airmiles                Passenger Miles on Commercial US Airlines,
##                         1937-1960
## airquality              New York Air Quality Measurements
## anscombe                Anscombe's Quartet of 'Identical' Simple Linear
##                         Regressions
## attenu                  The Joyner-Boore Attenuation Data
## attitude                The Chatterjee-Price Attitude Data
## austres                 Quarterly Time Series of the Number of
##                         Australian Residents
## beaver1 (beavers)       Body Temperature Series of Two Beavers
## beaver2 (beavers)       Body Temperature Series of Two Beavers
## cars                    Speed and Stopping Distances of Cars
## chickwts                Chicken Weights by Feed Type
## co2                     Mauna Loa Atmospheric CO2 Concentration
## crimtab                 Student's 3000 Criminals Data
## discoveries             Yearly Numbers of Important Discoveries
## esoph                   Smoking, Alcohol and (O)esophageal Cancer
## euro                    Conversion Rates of Euro Currencies
## euro.cross (euro)       Conversion Rates of Euro Currencies
## eurodist                Distances Between European Cities and Between
##                         US Cities
## faithful                Old Faithful Geyser Data
## fdeaths (UKLungDeaths)
##                         Monthly Deaths from Lung Diseases in the UK
## freeny                  Freeny's Revenue Data
## freeny.x (freeny)       Freeny's Revenue Data
## freeny.y (freeny)       Freeny's Revenue Data
## infert                  Infertility after Spontaneous and Induced
##                         Abortion
## iris                    Edgar Anderson's Iris Data
## iris3                   Edgar Anderson's Iris Data
## islands                 Areas of the World's Major Landmasses
## ldeaths (UKLungDeaths)
##                         Monthly Deaths from Lung Diseases in the UK
## lh                      Luteinizing Hormone in Blood Samples
## longley                 Longley's Economic Regression Data
## lynx                    Annual Canadian Lynx trappings 1821-1934
## mdeaths (UKLungDeaths)
##                         Monthly Deaths from Lung Diseases in the UK
## morley                  Michelson Speed of Light Data
## mtcars                  Motor Trend Car Road Tests
## nhtemp                  Average Yearly Temperatures in New Haven
## nottem                  Average Monthly Temperatures at Nottingham,
##                         1920-1939
## npk                     Classical N, P, K Factorial Experiment
## occupationalStatus      Occupational Status of Fathers and their Sons
## precip                  Annual Precipitation in US Cities
## presidents              Quarterly Approval Ratings of US Presidents
## pressure                Vapor Pressure of Mercury as a Function of
##                         Temperature
## quakes                  Locations of Earthquakes off Fiji
## randu                   Random Numbers from Congruential Generator
##                         RANDU
## rivers                  Lengths of Major North American Rivers
## rock                    Measurements on Petroleum Rock Samples
## sleep                   Student's Sleep Data
## stack.loss (stackloss)
##                         Brownlee's Stack Loss Plant Data
## stack.x (stackloss)     Brownlee's Stack Loss Plant Data
## stackloss               Brownlee's Stack Loss Plant Data
## state.abb (state)       US State Facts and Figures
## state.area (state)      US State Facts and Figures
## state.center (state)    US State Facts and Figures
## state.division (state)
##                         US State Facts and Figures
## state.name (state)      US State Facts and Figures
## state.region (state)    US State Facts and Figures
## state.x77 (state)       US State Facts and Figures
## sunspot.month           Monthly Sunspot Data, from 1749 to "Present"
## sunspot.year            Yearly Sunspot Data, 1700-1988
## sunspots                Monthly Sunspot Numbers, 1749-1983
## swiss                   Swiss Fertility and Socioeconomic Indicators
##                         (1888) Data
## treering                Yearly Treering Data, -6000-1979
## trees                   Girth, Height and Volume for Black Cherry Trees
## uspop                   Populations Recorded by the US Census
## volcano                 Topographic Information on Auckland's Maunga
##                         Whau Volcano
## warpbreaks              The Number of Breaks in Yarn during Weaving
## women                   Average Heights and Weights for American Women
## 
## 
## Use ‘data(package = .packages(all.available = TRUE))’
## to list the data sets in all *available* packages.

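Built-in datasets

Built-in datasets are handy for quickly testing functions. The ones R users reach for most often include iris, cars, mtcars, and airquality. If you are curious about what a built-in dataset contains, look it up the same way you would a function, with ? or help(); the detailed documentation appears in the Help pane at the bottom right.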

?iris # help(iris)

Looking up a built-in dataset
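Besides the help page, a few base R functions give a quick look at a built-in dataset. The following is a minimal sketch using iris; the particular functions chosen here are an illustrative selection rather than a fixed recipe.

```r
# Number of rows (observations) and columns (variables)
dim(iris)

# Compact overview of every column: its type and first few values
str(iris)

# The first three observations
head(iris, n = 3)

# Confirm that the dataset is stored as a data frame
class(iris)
```

The same calls work on the other datasets mentioned above, such as cars, mtcars, and airquality.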

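Reading Tabular Data: the read.table() Function

To read tabular data, use R's built-in read.table() function. In tabular data each row is one observation, and the variables are separated by a delimiter, most commonly a space, a tab (\t), or a comma (,). Suppose the desktop, c:/Users/YOURUSERNAME/Desktop/, holds a tabular file named friends_cast.txt whose variables are separated by spaces.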

"cast" "star"
"Rachel Green" "Jennifer Aniston"
"Monica Geller" "Courteney Cox"
"Phoebe Buffay" "Lisa Kudrow"
"Joey Tribbiani" "Matt LeBlanc"
"Chandler Bing" "Matthew Perry"
"Ross Geller" "David Schwimmer"

Use the read.table() function to read the .txt file.

file_path <- "c:/Users/YOURUSERNAME/Desktop/friends_cast.txt"
friends_cast <- read.table(file_path, header = TRUE, stringsAsFactors = FALSE)
View(friends_cast)

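Reading a .txt file

Three points are worth noting:

  1. file_path holds the path to the data and is passed as the first argument of read.table(). This example assumes the file is saved in c:/Users/YOURUSERNAME/Desktop/, that is, on the desktop. macOS users would write /Users/YOURUSERNAME/Desktop/friends_cast.txt instead. Pay attention to the direction of the slashes: forward slashes (/) work as path separators on both Windows and macOS, whereas a single backslash (\) starts an escape sequence inside an R string and would have to be doubled (\\).
  2. header specifies whether the first line of the file contains the variable names.
  3. stringsAsFactors specifies whether character columns are stored as factors. During data processing we usually prefer character vectors to factors.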
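The effect of header and stringsAsFactors is easiest to see side by side. The sketch below assumes a temporary file created with tempfile() in place of the desktop path, and uses only the first rows of the sample data.

```r
# Write a small space-delimited sample to a temporary file
# (assumption: a temp file stands in for the desktop path used in the text)
file_path <- tempfile(fileext = ".txt")
writeLines(c(
  '"cast" "star"',
  '"Rachel Green" "Jennifer Aniston"',
  '"Monica Geller" "Courteney Cox"'
), file_path)

# header = TRUE: the first line supplies the variable names
with_header <- read.table(file_path, header = TRUE, stringsAsFactors = FALSE)
names(with_header)  # "cast" "star"
nrow(with_header)   # 2

# header = FALSE: the first line is read as data; columns are named V1, V2
without_header <- read.table(file_path, header = FALSE, stringsAsFactors = FALSE)
names(without_header)  # "V1" "V2"
nrow(without_header)   # 3

# stringsAsFactors decides whether text columns become factors
as_factors <- read.table(file_path, header = TRUE, stringsAsFactors = TRUE)
class(with_header$cast)  # "character"
class(as_factors$cast)   # "factor"
```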

Suppose the desktop, c:/Users/YOURUSERNAME/Desktop/, holds a tabular file named friends_cast.tsv whose variables are separated by tabs (\t). The .tsv extension stands for tab-separated values.

"cast"  "star"
"Rachel Green"  "Jennifer Aniston"
"Monica Geller" "Courteney Cox"
"Phoebe Buffay" "Lisa Kudrow"
"Joey Tribbiani"  "Matt LeBlanc"
"Chandler Bing" "Matthew Perry"
"Ross Geller" "David Schwimmer"

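Use the read.table() function to read the .tsv file. The call does not need to change, because the default separator (sep = "") treats any run of white space, tabs included, as a delimiter.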

file_path <- "c:/Users/YOURUSERNAME/Desktop/friends_cast.tsv"
friends_cast <- read.table(file_path, header = TRUE, stringsAsFactors = FALSE)
View(friends_cast)

Reading a .tsv file
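To check that the same call handles both space-delimited and tab-delimited files, the sketch below writes the same sample rows with each delimiter and compares the results. The temporary files and the variable names are assumptions made for illustration.

```r
# The same rows, once separated by spaces and once by tabs
space_lines <- c(
  '"cast" "star"',
  '"Rachel Green" "Jennifer Aniston"',
  '"Ross Geller" "David Schwimmer"'
)
tab_lines <- gsub('" "', '"\t"', space_lines, fixed = TRUE)

space_file <- tempfile(fileext = ".txt")
tab_file <- tempfile(fileext = ".tsv")
writeLines(space_lines, space_file)
writeLines(tab_lines, tab_file)

# Identical calls: the default sep = "" splits on any white space
from_space <- read.table(space_file, header = TRUE, stringsAsFactors = FALSE)
from_tab <- read.table(tab_file, header = TRUE, stringsAsFactors = FALSE)

# TRUE when both files parse to the same data frame
identical(from_space, from_tab)
```

This works because every value is wrapped in double quotes. If the values were unquoted, the space inside a name such as Rachel Green would be taken as a delimiter, and a tab-separated file would need sep = "\t".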

Suppose the desktop, c:/Users/YOURUSERNAME/Desktop/, holds a tabular file named friends_cast.csv whose variables are separated by commas (,). The .csv extension stands for comma-separated values.

"cast","star"
"Rachel Green","Jennifer Aniston"
"Monica Geller","Courteney Cox"
"Phoebe Buffay","Lisa Kudrow"
"Joey Tribbiani","Matt LeBlanc"
"Chandler Bing","Matthew Perry"
"Ross Geller","David Schwimmer"

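Reading this file requires setting the sep argument. By default, sep recognizes only white space (one or more spaces or tabs) as the delimiter; now that the variables are separated by commas, the matching delimiter has to be specified.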

file_path <- "c:/Users/YOURUSERNAME/Desktop/friends_cast.csv"
friends_cast <- read.table(file_path, header = TRUE, stringsAsFactors = FALSE, sep = ",")
View(friends_cast)

Reading a .csv file
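Base R also provides read.csv(), a wrapper around read.table() whose defaults already include header = TRUE and sep = ",". The sketch below, which assumes a temporary file in place of the desktop path, compares the two calls on the same data.

```r
# Write a small comma-separated sample to a temporary file
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  '"cast","star"',
  '"Rachel Green","Jennifer Aniston"',
  '"Monica Geller","Courteney Cox"'
), csv_path)

# Explicit delimiter with read.table()
via_read_table <- read.table(csv_path, header = TRUE, sep = ",",
                             stringsAsFactors = FALSE)

# read.csv() sets header = TRUE and sep = "," by default
via_read_csv <- read.csv(csv_path, stringsAsFactors = FALSE)

# Compare the two results
all.equal(via_read_table, via_read_csv)
```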

The previous examples read files from local paths, but read.table() can also read tabular data from the web: simply replace the local path with a URL.

file_url <- "https://s3-ap-northeast-1.amazonaws.com/r-essentials/friends_cast.csv"
friends_cast <- read.table(file_url, header = TRUE, stringsAsFactors = FALSE, sep = ",")

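Reading a .csv file from the web

Reading Non-tabular Data: the readLines() Function

Not every file we want to read is tabular data neatly separated by a delimiter. A text file in script form, such as friends_script.txt, is one example.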

Ross: (mortified) Hi.
Joey: This guy says hello, I wanna kill myself.
Monica: Are you okay, sweetie?
Ross: I just feel like someone reached down my throat, grabbed my small intestine, pulled it out of my mouth and tied it around my neck...
Chandler: Cookie?
Monica: (explaining to the others) Carol moved her stuff out today.
Joey: Ohh.
Monica: (to Ross) Let me get you some coffee.
Ross: Thanks.
Phoebe: Ooh! Oh! (She starts to pluck at the air just in front of Ross.)
Ross: No, no don't! Stop cleansing my aura! No, just leave my aura alone, okay?
Phoebe: Fine!  Be murky!

The readLines() function reads such text line by line and returns a character vector in which each line of the file, in order, occupies one element.

file_url <- "https://s3-ap-northeast-1.amazonaws.com/r-essentials/friends_script.txt"
friends_script <- readLines(file_url)
friends_script
## > file_url <- "https://s3-ap-northeast-1.amazonaws.com/r-essentials/friends_script.txt"
## > friends_script <- readLines(file_url)
## > friends_script
##  [1] "Ross: (mortified) Hi."                                                                                                                     
##  [2] "Joey: This guy says hello, I wanna kill myself."                                                                                           
##  [3] "Monica: Are you okay, sweetie?"                                                                                                            
##  [4] "Ross: I just feel like someone reached down my throat, grabbed my small intestine, pulled it out of my mouth and tied it around my neck..."
##  [5] "Chandler: Cookie?"                                                                                                                         
##  [6] "Monica: (explaining to the others) Carol moved her stuff out today."                                                                       
##  [7] "Joey: Ohh."                                                                                                                                
##  [8] "Monica: (to Ross) Let me get you some coffee."                                                                                             
##  [9] "Ross: Thanks."                                                                                                                             
## [10] "Phoebe: Ooh! Oh! (She starts to pluck at the air just in front of Ross.)"                                                                  
## [11] "Ross: No, no don't! Stop cleansing my aura! No, just leave my aura alone, okay?"                                                           
## [12] "Phoebe: Fine!  Be murky!"

If the source text file is very large, add the n argument to limit the number of lines read.

file_url <- "https://s3-ap-northeast-1.amazonaws.com/r-essentials/friends_script.txt"
friends_script <- readLines(file_url, n = 2)
friends_script
## > file_url <- "https://s3-ap-northeast-1.amazonaws.com/r-essentials/friends_script.txt"
## > friends_script <- readLines(file_url, n = 2)
## > friends_script
## [1] "Ross: (mortified) Hi."                          
## [2] "Joey: This guy says hello, I wanna kill myself."
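Once the lines are in a character vector, ordinary string functions can give them structure. The following sketch assumes a temporary local copy of a few script lines, so it runs without a network connection, and splits each line at its first colon; the column names speaker and dialogue are hypothetical.

```r
# Save a few lines of the script to a temporary file
script_path <- tempfile(fileext = ".txt")
writeLines(c(
  "Ross: (mortified) Hi.",
  "Joey: This guy says hello, I wanna kill myself.",
  "Monica: Are you okay, sweetie?",
  "Chandler: Cookie?"
), script_path)

# n limits how many lines are read
first_two <- readLines(script_path, n = 2)
length(first_two)  # 2

# Read everything, then split each line at the first colon
all_lines <- readLines(script_path)
speaker <- sub(":.*$", "", all_lines)           # text before the first colon
dialogue <- sub("^[^:]*:\\s*", "", all_lines)   # text after it, trimmed

script_df <- data.frame(speaker, dialogue, stringsAsFactors = FALSE)
script_df
```

This turns non-tabular text into a data frame that the rest of the chapter's tools can handle.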

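Reading Other Common Data Formats

Other common formats for storing data include Excel spreadsheets and JSON (JavaScript Object Notation). Reading these formats relies on packages.

Data format          Package     Developer
Excel spreadsheet    readxl      Hadley Wickham
JSON                 jsonlite    Jeroen Ooms

Using a package in R takes two steps: installing it and loading it. The difference is like buying a reference book versus consulting it. Installing a package is like buying the book and putting it on the shelf; loading a package is like taking the book down when you need to look something up. Use the install.packages() function to install a package and the library() function to load it.

  • Install a package: install.packages(), run once for a given R version
  • Load a package: library(), run in every session that uses the package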

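Installing and loading packages

The two functions above install and load the packages.

# Install and load the packages
pkgs <- c("readxl", "jsonlite")
install.packages(pkgs) # skip this line if the packages are already installed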
library(readxl)
library(jsonlite)
## > # Install and load the packages
## > pkgs <- c("readxl", "jsonlite")
## > install.packages(pkgs) # skip this line if the packages are already installed
## trying URL 'https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.5/readxl_1.1.0.tgz'
## Content type 'application/x-gzip' length 1498484 bytes (1.4 MB)
## ==================================================
## downloaded 1.4 MB
## 
## trying URL 'https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.5/jsonlite_1.6.tgz'
## Content type 'application/x-gzip' length 1114907 bytes (1.1 MB)
## ==================================================
## downloaded 1.1 MB
## 
## 
## The downloaded binary packages are in
##  /var/folders/0b/r__z5mpn6ldgb_w2j7_y_ntr0000gn/T//Rtmp9VNgx9/downloaded_packages
## > library(readxl)
## > library(jsonlite)

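Alternatively, use the Packages pane at the bottom right of RStudio, which offers a graphical interface for installing and loading packages.

Click Install in the Packages pane at the bottom right

Enter the names of the packages to install

After installation, tick the checkbox to load readxl

After installation, tick the checkbox to load jsonlite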
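Because installation is needed only once while loading is needed every session, a script can check which packages are missing before deciding whether to install anything. The following is a minimal sketch; the helper name missing_packages is hypothetical, and the install.packages() call is left commented out so that running the block never triggers a download.

```r
# Return the packages from pkgs that are not installed yet
# (missing_packages is a hypothetical helper, not part of base R)
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

pkgs <- c("readxl", "jsonlite")
to_install <- missing_packages(pkgs)

if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install the missing packages
}
```

requireNamespace() only reports whether a package can be loaded; library() is still needed afterwards to attach it.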

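Reading Excel Spreadsheets: read_excel()

Suppose the desktop, c:/Users/YOURUSERNAME/Desktop/, holds an Excel spreadsheet named friends_cast.xlsx.

Excel spreadsheet

Whether the package was installed and loaded with commands or through the graphical interface, the read_excel() function from readxl reads the Excel spreadsheet.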

library(readxl)

file_path <- "c:/Users/YOURUSERNAME/Desktop/friends_cast.xlsx"
friends_cast <- read_excel(file_path)
View(friends_cast)

Reading an .xlsx file
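An Excel workbook can hold several sheets, so it helps to list them before reading. The sketch below assumes the readxl package is installed and uses the example workbook datasets.xlsx that ships with readxl, since friends_cast.xlsx is not available here.

```r
library(readxl)

# Path to an example workbook bundled with readxl
# (assumption: used in place of friends_cast.xlsx)
example_path <- readxl_example("datasets.xlsx")

# List the sheet names in the workbook
sheet_names <- excel_sheets(example_path)
sheet_names

# Read a specific sheet by name
mtcars_sheet <- read_excel(example_path, sheet = "mtcars")
head(mtcars_sheet, n = 3)

# n_max limits the number of data rows read
first_rows <- read_excel(example_path, sheet = "iris", n_max = 5)
nrow(first_rows)  # 5
```

Without the sheet argument, read_excel() reads the first sheet, which is what the friends_cast.xlsx example above relies on.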

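Reading JSON: fromJSON()

JSON (JavaScript Object Notation) is a lightweight text format commonly used for data exchange. Its closest counterpart in R is the named list, which likewise pairs keys with values. Tabular data stored as JSON takes the form of an array of JSON objects.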

[
  {
    "cast": "Rachel Green",
    "star": "Jennifer Aniston"
  },
  {
    "cast": "Monica Geller",
    "star": "Courteney Cox"
  },
  {
    "cast": "Phoebe Buffay",
    "star": "Lisa Kudrow"
  },
  {
    "cast": "Joey Tribbiani",
    "star": "Matt LeBlanc"
  },
  {
    "cast": "Chandler Bing",
    "star": "Matthew Perry"
  },
  {
    "cast": "Ross Geller",
    "star": "David Schwimmer"
  }
]

The fromJSON() function from the jsonlite package reads an array of JSON objects into a data frame.

library(jsonlite)

file_url <- "https://s3-ap-northeast-1.amazonaws.com/r-essentials/friends_cast.json"
friends_cast <- fromJSON(file_url)
View(friends_cast)

Reading a .json file
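The correspondence between JSON and R structures can be seen by passing JSON text directly to fromJSON(), which accepts a string as well as a path or URL. The sketch below assumes jsonlite is installed and uses two short inline strings instead of the hosted file.

```r
library(jsonlite)

# A single JSON object becomes a named list
single_object <- fromJSON('{"cast": "Rachel Green", "star": "Jennifer Aniston"}')
class(single_object)  # "list"
single_object$star    # "Jennifer Aniston"

# An array of JSON objects with the same keys becomes a data frame
array_of_objects <- fromJSON('[
  {"cast": "Rachel Green", "star": "Jennifer Aniston"},
  {"cast": "Ross Geller", "star": "David Schwimmer"}
]')
class(array_of_objects)  # "data.frame"
array_of_objects
```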

Writing Tabular Data: write.table()

The write.table() function writes an R data frame to a plain-text file with the .txt extension, separating variables with a space by default.

star <- c("Jennifer Aniston", "Courteney Cox", "Lisa Kudrow", "Matt LeBlanc", "Matthew Perry", "David Schwimmer")
cast <- c("Rachel Green", "Monica Geller", "Phoebe Buffay", "Joey Tribbiani", "Chandler Bing", "Ross Geller")
friends_cast <- data.frame(cast, star, stringsAsFactors = FALSE)
write.table(friends_cast, file = "c:/Users/YOURUSERNAME/Desktop/friends_cast.txt", row.names = FALSE)

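After running the code, friends_cast.txt appears in c:/Users/YOURUSERNAME/Desktop (the desktop). Note that the file argument gives the path of the output file, and row.names = FALSE keeps the data frame's row indices out of the output, so the file looks closer to what we are used to.

Writing a .txt file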
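To see what row.names = FALSE changes, the sketch below writes the same data frame twice and prints the raw lines of each file. It assumes temporary files in place of the desktop path and uses only the first two cast members.

```r
friends_cast <- data.frame(
  cast = c("Rachel Green", "Monica Geller"),
  star = c("Jennifer Aniston", "Courteney Cox"),
  stringsAsFactors = FALSE
)

with_row_names <- tempfile(fileext = ".txt")
without_row_names <- tempfile(fileext = ".txt")

# Default: row indices are written as an extra leading field
write.table(friends_cast, file = with_row_names)

# row.names = FALSE: only the variables are written
write.table(friends_cast, file = without_row_names, row.names = FALSE)

readLines(with_row_names)
readLines(without_row_names)
```

In the first file each data line starts with the quoted row index ("1", "2"), while the header line still lists only the two variable names; the second file contains just the header and the data.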

Passing sep = "," to write.table() writes a comma-separated plain-text file with the .csv extension. After running the code, friends_cast.csv appears in c:/Users/YOURUSERNAME/Desktop (the desktop).

star <- c("Jennifer Aniston", "Courteney Cox", "Lisa Kudrow", "Matt LeBlanc", "Matthew Perry", "David Schwimmer")
cast <- c("Rachel Green", "Monica Geller", "Phoebe Buffay", "Joey Tribbiani", "Chandler Bing", "Ross Geller")
friends_cast <- data.frame(cast, star, stringsAsFactors = FALSE)
write.table(friends_cast, file = "c:/Users/YOURUSERNAME/Desktop/friends_cast.csv", row.names = FALSE, sep = ",")

Writing a .csv file
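Base R's write.csv() is a wrapper around write.table() that fixes the separator to a comma. A quick way to check an output file is to read it back and compare it with the original; the sketch below does this, assuming a temporary file in place of the desktop path.

```r
friends_cast <- data.frame(
  cast = c("Rachel Green", "Monica Geller", "Phoebe Buffay"),
  star = c("Jennifer Aniston", "Courteney Cox", "Lisa Kudrow"),
  stringsAsFactors = FALSE
)

csv_path <- tempfile(fileext = ".csv")

# write.csv() always uses a comma as the separator
write.csv(friends_cast, file = csv_path, row.names = FALSE)

# Read the file back and compare it with the original data frame
round_trip <- read.csv(csv_path, stringsAsFactors = FALSE)
all.equal(friends_cast, round_trip)
```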

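Writing Non-tabular Data: toJSON()

To write an R data frame to a plain-text file with the .json extension, first convert it with the toJSON() function, which turns the data frame into a string holding an array of JSON objects, and then write that string out with the writeLines() function. After running the code, friends_cast.json appears in c:/Users/YOURUSERNAME/Desktop (the desktop).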

library(jsonlite)

star <- c("Jennifer Aniston", "Courteney Cox", "Lisa Kudrow", "Matt LeBlanc", "Matthew Perry", "David Schwimmer")
cast <- c("Rachel Green", "Monica Geller", "Phoebe Buffay", "Joey Tribbiani", "Chandler Bing", "Ross Geller")
friends_cast <- data.frame(cast, star, stringsAsFactors = FALSE)
array_of_json <- toJSON(friends_cast)
writeLines(array_of_json, con = "c:/Users/YOURUSERNAME/Desktop/friends_cast.json")

Writing a .json file
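By default toJSON() produces a single compact line. The sketch below assumes jsonlite is installed and a temporary file in place of the desktop path; it uses pretty = TRUE for indented output and then reads the file back with fromJSON() to confirm that the data survive the round trip.

```r
library(jsonlite)

friends_cast <- data.frame(
  cast = c("Rachel Green", "Monica Geller"),
  star = c("Jennifer Aniston", "Courteney Cox"),
  stringsAsFactors = FALSE
)

# Compact output: the whole array on one line
compact_json <- toJSON(friends_cast)
compact_json

# pretty = TRUE adds line breaks and indentation for readability
pretty_json <- toJSON(friends_cast, pretty = TRUE)
pretty_json

# Write the JSON text to a file, then read it back
json_path <- tempfile(fileext = ".json")
writeLines(pretty_json, con = json_path)
round_trip <- fromJSON(json_path)
round_trip
```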

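Summary

This section introduced data input and output in R: browsing built-in datasets, reading tabular data with read.table(), reading non-tabular data with readLines(), reading Excel spreadsheets with read_excel() from the readxl package, reading JSON files with fromJSON() from the jsonlite package, writing tabular data with write.table(), and writing JSON files with toJSON() from the jsonlite package together with writeLines().

Exercises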

url <- "http://archive.ics.uci.edu/ml/machine-learning-databases/iris/bezdekIris.data"
iris_df <- read.table(___, header = ___, sep = "___")
url <- "http://archive.ics.uci.edu/ml/machine-learning-databases/iris/bezdekIris.data"
iris_df <- read.table(___, header = ___, sep = "___")
names(iris_df) <- c("___", "___", "___", "___", "___")
  • Practice writing the built-in dataset cars to cars.csv at a local path; remember to set the argument row.names = FALSE
file_path <- "___" # your choice
write.csv(cars, file = file_path, row.names = ___)

Further Reading