Monday, November 28, 2016

A super-practical cheat sheet! From opening to closing, these are the phrases to use in an English presentation

http://www.businessweekly.com.tw/article.aspx?id=9726&type=Blog&p=3
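Referring back to earlier points
1. I’d like to refer back to my earlier discussion of… (returning to something discussed earlier)
2. To go back to the idea of… (returning to an earlier idea)
5. Emphasis and stronger tone
Emphasizing a message
1. I cannot overstate the importance of this fact. (I must stress how important this fact is.)
2. This is our fundamental problem. (This is our most basic problem.)
3. This is the main point I want to make today: … (This is the key point I want to raise today.)
4. If you take nothing else from this speech with you, take this: … (Even if this talk gives you nothing else, you can at least take this away.)
5. Our focus must be… (Our attention must be placed on…)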
==============================
Language usage weblog
https://languagetips.wordpress.com/category/overstateunderstate/
Tip 2: Understated or overstated
A reader writes:
 “Understated/overstated”
As in, “the importance of voting to a democratic society cannot be understated.”  WRONG!
As written, this means, I think, that no matter how small the importance of voting is, no words could possibly make it seem any smaller.  The writer should have used “overstated” here, meaning that all the emphatic expressions imaginable about the importance of voting would not be going too far, given how important voting is.  Or am I wrong?
I think the reader is right. I think the writer is trying to say that voting is so important that we can’t say enough about it. But the writer confused ‘understated’ with ‘overstated,’ which is the word he (I’m just assuming the mistake was made by a he) wanted.
‘Understate’ means to represent something as less than it actually is. ‘Overstate’ is just the opposite, and it means to represent something as greater than it is, to exaggerate. The writer really wants to say that he can’t say enough about the importance of voting.

## "I cannot overstate the importance of voting to a democratic society." ##
==================================
Tip 3: Underestimate/overestimate
This is actually a fairly common mistake. And understate/overstate is not alone. Underestimate is often misused when overestimate is the intended word.
You cannot underestimate the devastation that Hurricane Sandy brought.
Well, yes, actually you can. You can easily underestimate it by thinking that little destruction occurred. There was quite a bit of destruction—whole communities were lost. It would be more appropriate to say:
You cannot overestimate the devastation that Hurricane Sandy brought.
==================================
Tip 4: Could care less/couldn't care less
There is a phrase I can think of that suffers from the same type of confusion:
I could care less about the outcome of the Steelers game.
What the speaker, here, really means to say is this (assuming the speaker doesn’t care):
I couldn’t care less about the outcome of the Steelers game.
This means the speaker doesn’t care at all about whether the Steelers won or not. The first sentence is saying that he does care about the outcome.
[NOTE: Judging by the reactions of the people I saw on Sunday afternoon, the first sentence is really the more accurate one in Pittsburgh.]
But the issue is that ‘could care less’ is frequently used when the speaker (and I say speaker because this construct is rarely written) really means ‘couldn’t care less’ which is the American idiom.
[NOTE: The confusion between the two phrases is so common that some dictionaries consider ‘could care less’ to be an idiom meaning ‘couldn’t care less.’ Yikes!]
Many of the examples I found of these errors—understate or underestimate for overstate or overestimate—are in the political arena. Politicians seem to mistakenly underestimate a lot of things.

Sunday, November 27, 2016

R_2016_1118_First class

#################################################
# Chapter 1
# Basic arithmetic rules
#################################################
## Lecture notes
#http://biostat.tmu.edu.tw/R/20161118/#1
# download Notepad++ (NPP)
# download "NppToR"



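## Simple numeric and string operations
1+2
3-4
5*6
7/8
9^0  # same as 9**0
1+2*(3/4)^5  # multiplication and division come before addition and subtraction; parentheses are evaluated first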
sqrt(2)
abs(-1)

## Basic vector operations
x = c(1, 2, 3, 4, 5)  # same values as 1:5
y = x + 1
x*y
length(x)  # vector length
sum(y)  # sum
prod(x)  # product of all elements
mean(y)  # mean
z = c(x, y)

## Indexing vectors and matrices
x[1]
y[c(2, 4)]
z[4:6]  # same as z[c(4, 5, 6)]
x >= 3
y[x >= 3]             # select elements with logical values
x[x <= 2 | y == 5]  # and (&, &&), or (|, ||); equality comparison uses two equals signs (==)
length(x[x < 4])
sum(y[y != 6])  # != means not equal; ! means negation
x[-1]
y[-(2:4)]
X = rbind(x, y)  # row bind
Y = cbind(x, y)  # column bind
X[1, 5]
X[, 2]
Y[2, ]
X[, c(1, 3)]
Y[2:4, -1]
Y[-1, -2]

## Several ways to index
iris  ## the iris dataset, a built-in data.frame for practice in R
iris[, 5]
iris[, "Species"]
iris$Species
iris[["Species"]]
iris["Species"]  # same as iris[5]
names(iris)
names(iris) = c("A","B","C","D","E")
iris$A
iris[18, c("B","D")]
iris[iris$E == "setosa", 1:4]


#################################################
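# Chapter 2
# Variable types
#################################################

## R data types: logical (TRUE/FALSE, T, F), integer,
##  double-precision numbers (double, real, numeric),
##  complex, character strings (character, string), binary data (raw)

1; 20.0; 3e2  # numeric values
class(1) # check the data type
"stat"  # text
class("stat")
TRUE; T; FALSE; F  # logical values; TRUE and FALSE must be all uppercase and are reserved words in R (T and F are ordinary variables preset to them)
class(T)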

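## R variable kinds: vector, matrix, array,
##  factor, data frame, list, time series (ts)

## Vector ##
x = c(1, 3, 5, 7, 9)  # create a vector, or join several vectors
is.vector(x)  # check whether x is a vector
y = c(2, "stat", T)  # elements must share one type, so mixed inputs are coerced (here to character)  *** data types and variable kinds are different things
x[1]; y[-2]  # vector indexing  # the indexing system
x[c(1, 3, 5)] = c(2, 4, 6)
c(x, y[1])
length(x)  # number of elements
names(x)  # query or set the element names of a vector
names(x) = c("a", "b", "c", "d", "e")
x[c(3, 5)]; x[c("c", "e")]    # the index is itself a vector  # R is built on vectors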
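
## A small sketch of the coercion rule noted above. The names v and w are
## illustrative and not part of the lecture: c() converts mixed inputs to a
## single common type.

```r
v = c(2, "stat", TRUE)   # number, string, logical: everything becomes character
class(v)                 # "character"
w = c(2, TRUE)           # a logical is promoted to numeric
class(w)                 # "numeric"
```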

## Array, matrix ##
X = array(1:6, c(3, 2))  # matrix(1:6, 3, 2) # create an array
Y = array(1:12, c(2, 3, 2))
is.array(X); is.matrix(X)  # check whether X is an array and a matrix
is.array(Y); is.matrix(Y)  # check whether Y is an array and a matrix
rbind(x, x); cbind(x, x)  # rbind and cbind also build arrays
nrow(X)  # number of rows
ncol(X)  # number of columns
dim(Y)  # dimensions of the array Y
rownames(X) = c("R1", "R2", "R3")
colnames(X) = c("C1", "C2")

## Matrix ##
X = matrix(1:6, 3, 2)
Y = t(X)  # transpose
Z = X %*% Y  # matrix multiplication
diag(Z)  # diagonal elements
det(Z)  # determinant
A = matrix(1:4, 2, 2)
b = c(2,2)
solve(A)  # inverse matrix
solve(A, b)  # solve a system of linear equations
eigen(Z)  # eigenvalues and eigenvectors
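
## One way to check the result of solve(A, b) is to multiply back; a minimal
## sketch using the same A and b as above (sol is an illustrative name).

```r
A = matrix(1:4, 2, 2)
b = c(2, 2)
sol = solve(A, b)      # solution of A %*% sol = b
sol                    # -1  1
A %*% sol              # reproduces b as a column
all.equal(as.vector(A %*% sol), b)
```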

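## Factor ##
x = c(1, 1, 1, 2, 2, 2)
y = factor(x)  # same as as.factor(x)
y - 1  # arithmetic is not meaningful for factors: returns NA with a warning
levels(y)  # query or set the category labels
levels(y) = c("一", "二")
nlevels(y)  # number of categories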
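
## Since arithmetic on a factor does not work, here is a hedged sketch of
## how to get numbers back out of one (f is an illustrative name).

```r
f = factor(c(10, 10, 20))
as.integer(f)                 # internal level codes: 1 1 2
as.numeric(as.character(f))   # original values: 10 10 20
```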

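## List ##   stores elements of different data types
l = list(L1 = x, L2 = y, L3 = Z)  # create the list and name its elements at the same time
names(l)
l$L1  # same as l[[1]] or l[["L1"]]
l[1]
l[[1]]
class(l[1]);class(l[[1]])  ## compare the two list indexing forms: the resulting data types differ ##
l[2]  # same as l["L2"]
l$L3[1, 2]
l$L4 = 1:5
c(l, list(L5 = 1:10))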
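
## The difference between [ ] and [[ ]] on a list, as a standalone sketch
## (lst is an illustrative name): single brackets keep the list wrapper,
## double brackets extract the element itself.

```r
lst = list(a = 1:3, b = "text")
class(lst[1])      # "list": a one-element sub-list
class(lst[[1]])    # "integer": the element itself
lst[[1]][2]        # 2
```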

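## Data frame ## a tidier list
## It is really a list, a special case of list ## columns may differ in type, but each column holds a single type
D = as.data.frame(Z)  # convert to a data frame; think of hospital records with an ID, a name, numeric values...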
D[, 4] = c(T, F, T)
names(D) = c("D1", "D2", "D3", "D4")
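
## A sketch of the "data frame is a list" remark above. The column names are
## made up for illustration; stringsAsFactors = FALSE is set explicitly so
## the result does not depend on the R version.

```r
df = data.frame(id = 1:3, name = c("a", "b", "c"),
                flag = c(TRUE, FALSE, TRUE), stringsAsFactors = FALSE)
is.list(df)          # TRUE: a data frame is a list of equal-length columns
sapply(df, class)    # one type per column: integer, character, logical
```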
===========================================


########################################################
##          R software course series - Introduction to R (1)
##          Reference answers to in-class exercises
##          2016/11/18
########################################################


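###################################
##  Basic working environment    ##
###################################

## Set the working directory to "D:\R_work\"
setwd("D:/R_work/")  # note the slash direction

## Try downloading and installing the "rgl" package
install.packages("rgl")
library(rgl)  # load the package

## Use demo to view the function demos in the rgl package
demo(rgl, package="rgl")

## Use help to read the documentation of a function in the rgl package
help(surface3d)  # same as ?surface3d

## Try running the function examples in the rgl package
example(plot3d)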


##############################
##  Basic arithmetic rules  ##
##############################

## Generate a vector of 2, 4, 6, 8, 10 ##
c(2, 4, 6, 8, 10)
1:5 * 2  # also works
seq(2, 10, by=2)  # also works

## Extract the rows of the iris data set whose species is setosa
data(iris)
iris_setosa = iris[iris$Species == "setosa",]
iris_setosa = iris[iris["Species"] == "setosa",]  # also works
iris_setosa = iris[iris[,5] == "setosa",]  # also works

## Sort the data above by the Sepal.Length variable
iris_setosa[order(iris_setosa$Sepal.Length),]


################################
##  Variables and data in R   ##
################################

## Given x = 1:5 and y = c("一", "二", "三", "四", "五")
## rename the elements of x to 甲, 乙, 丙, 丁, 戊
x = 1:5
y = c("一", "二", "三", "四", "五")
names(x) = c("甲", "乙", "丙", "丁", "戊")

## Query the dimensions of rbind(x, y)
rxy = rbind(x, y)
dim(rxy)

## Convert y to a factor variable
y = as.factor(y)

## Convert cbind(x, y) to a data frame
cxy = cbind(x, y)  # note: cbind() stores the factor y as its integer codes; data.frame(x, y) would keep the labels
cxy = as.data.frame(cxy)

## Create a 2*3*4 three-dimensional array whose elements are 1:24
array(1:24, dim=c(2,3,4))

## Check whether rbind(x, y) is an array
is.array(rxy)

## Combine x and y into a list and look up its first element
lxy = list(X=x, Y=y)
lxy$X; lxy[["X"]]; lxy[[1]]  # try lxy["X"] or lxy[1]: how does the type of the result differ?


R_2016_TMU_Second class

#################################################
# Chapter 1
# Data input and output
#################################################
## http://biostat.tmu.edu.tw/R/20161125/#2  ##
setwd("D:/R_work/")  # set the working directory # use forward slashes (/), not backslashes!!
# use ls() to list the objects in the workspace
## Text file input ##
babies = read.table("babies.txt", header=T)  ## = is assignment; capital T is the logical value TRUE!
babies = na.exclude(babies)  # drop rows that contain missing values
Iris = read.table("iris_dataset.txt", header=F, sep=",")
IRIS = read.csv("iris.csv", header=F)# the .csv format is likewise comma-separated
# babies1 = read.fwf("babies.txt", widths=..., header=T)  ## read.fwf also needs a widths vector (the width of each fixed-width column); the call fails without it
## Text file output ##
cat(babies$smoke, file="smoke1.txt", sep="")
write(babies$smoke, file="smoke2.txt", sep=",") # ncolumns defaults to 5, so a line break follows every five values!
weight = babies[babies$weight < 100,]
height = babies[babies$height > 70,]
write.table(weight, file="weight.txt", sep=",", row.names=F)
write.csv(height, file="height.csv", row.names=F)
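
## A self-contained round trip through a csv file, as a sketch of the calls
## above. It writes to a temporary file rather than the course data files;
## d, d2 and tmp are illustrative names.

```r
tmp = tempfile(fileext = ".csv")
d = data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))
write.csv(d, file = tmp, row.names = FALSE)   # without row names
d2 = read.csv(tmp)                            # header = TRUE is the default for read.csv
all.equal(d, d2)                              # TRUE when the round trip preserved the data
unlink(tmp)                                   # remove the temporary file
```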

## Reading and writing data files from other software ##
library(gdata)
babies_xls = read.xls("babies.xls", sheet=1)  # read an xls file; requires the path to perl!
library(xlsx) # easier with 32-bit R; 64-bit R can run into other problems!
babies_xlsx = read.xlsx("babies.xlsx", sheetIndex=2)  # read an xlsx file, specifying the worksheet number!
write.xlsx(iris, "iris.xlsx", sheetName="iris")  # export an xlsx file; noticeably slow!
library(sas7bdat)
babies_sas = read.sas7bdat("babies.sas7bdat")  # read a SAS data file
library(foreign)
babies_spss = read.spss("babies.sav", to.data.frame=T)  # read an SPSS data file

## Saving and loading R objects ##
save(weight, height, file="babies.RData")# any number of objects can be saved, separated by commas!
save.image()  # save the workspace
load("babies.RData")  # use ls() to confirm that the objects have been loaded!
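
## A sketch of the save()/load() cycle using a temporary file and two
## throwaway objects (a, b, tmp and restored are illustrative names).
## load() returns the names of the objects it restored.

```r
tmp = tempfile(fileext = ".RData")
a = 1:3; b = "text"
save(a, b, file = tmp)   # several objects in one file
rm(a, b)                 # remove them from the workspace
restored = load(tmp)     # brings a and b back
restored                 # "a" "b"
unlink(tmp)
```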


#################################################
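# Chapter 2
# Program flow control
#################################################

## Logical expressions ##
## Operator precedence:
##  parentheses => multiplication/division => addition/subtraction => comparison => logical => assignment
x = 1
x == 3
x != 1 + 2
!(x <= 3)
x %in% 1:5  # is x among 1 to 5?!!
x < 0 || x > 5  # || means or!
(is.matrix(x) || x >= 0) & (1 < 2)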
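
## A short sketch contrasting the two "or" operators and the precedence
## order listed above (v and s are illustrative names). | works element by
## element, while || is meant for single values such as an if() condition.

```r
v = c(-1, 3, 7)
v < 0 | v > 5          # element-wise: TRUE FALSE TRUE
s = 7
s < 0 || s > 5         # single value: TRUE
2 + 3 * 4 > 10 & TRUE  # arithmetic first, then comparison, then logic: TRUE
```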

## Conditional execution ##
x = 1
if (x == 3) y = 10 else y = 20
if (x >= 5) {         # () parentheses, {} braces!
  y = 15
} else {
  y = 0
}  # recommended style
if (x < 0) {
  y = x - 1
} else if (x > 0) {    # else if: when the first condition fails, test x against the next one!
  y = x + 1
} else {
  y = x
}

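## for loop ##       suited to cases where the number of iterations is known in advance!!!
y = vector()  # declare the variable first, because y is indexed with [] below!
for (x in 1:5) { # a for loop starts by stating the range of x!!!
  y[x] = sqrt(x)
}
z = 1  # initial value!
for (i in c(2,4,6,8,10)) {
  z = z * i
}  # 2*4*6*8*10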

## while loop ##      for when the number of iterations is not known in advance!
x = 1; y = vector()
while (x <= 5) { # a while loop starts with a condition!
  y[x] = sqrt(x)
  x = x + 1
}
z = 1; i = 2
while (i <= 10) {
  z = z * i
  i = i + 2
}

## repeat loop ##
x = 1; y = vector()
repeat {
  y[x] = sqrt(x)
  x = x + 1
  if (x > 5) break  # exit the loop
}
z = 1; i = 1
repeat {
  i = i + 1
  if (i > 10) {break} else if (i %% 2 != 0) {next} ## "%%" gives the remainder; here, i divided by 2 leaves a nonzero remainder!!
  z = z * i
}  # next: skip the rest of the current iteration
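
## For comparison, the two quantities the loops above compute can also be
## obtained without a loop. This is a sketch of the vectorized equivalents,
## not part of the lecture script.

```r
y = sqrt(1:5)                  # square roots of 1..5 in one call
z = prod(seq(2, 10, by = 2))   # 2*4*6*8*10 = 3840
```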


#################################################
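# Chapter 3
# User-defined functions
#################################################

## User-defined functions in R ##
# Function definition syntax:
# function_name = function(arg1, arg2, ...)
# {
# full expressions...
# }

func1 = function(a, b)
{
  x = 1+2*3/4        ## this x exists only inside the function and does not affect any x outside!!
  y <<- a + b        ## <<- assigns this y to a variable outside the function!!!
  return(y)  # the last evaluated value is returned by default, or use return()
}
func1(7, 6)
func1(b=3, a=2)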
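
## A minimal sketch of local assignment versus <<- (total, tmp and
## add_to_total are illustrative names, not from the lecture).

```r
total = 0
add_to_total = function(v) {
  tmp = v * 2            # tmp is local to the function
  total <<- total + v    # changes total outside the function
  tmp                    # last value is returned
}
r = add_to_total(3)
r        # 6
total    # 3
```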

## Default argument values ##
func2 = function(x=0)
{
  sum(x)/length(x)
}
func2(1:5)
func2()  # the default value of x is 0

## The ... argument ##
func3 = function(x, ...)    ## ... passes these extra arguments on to the function called inside
{
  y = mean(x, ...) + 1
  return(y)
}
x = c(2,4,6,NA,10)
func3(x, trim=0.1)
func3(x, trim=0, na.rm=TRUE)
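
## What the two calls above illustrate, restated as a standalone sketch
## (plus_one is a hypothetical wrapper with the same shape as func3):
## arguments passed through ... reach mean(), so na.rm decides whether the
## NA is dropped.

```r
plus_one = function(x, ...) mean(x, ...) + 1
v = c(2, 4, 6, NA, 10)
plus_one(v)                  # NA: mean() propagates the missing value
plus_one(v, na.rm = TRUE)    # 6.5: na.rm is forwarded to mean()
```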

## Binary operators ##
"%p%" = function(a, b)    ## the name must be wrapped in percent signs and quoted!!
{
  factorial(a)/factorial(a-b)
}
5 %p% 2


#################################################
# Chapter 4
# Programming techniques
#################################################

## apply function ##
apply(iris[-5], 2, max)
func = function(x) x[x < mean(x)]
apply(iris[,1:4], 2, func)

## tapply function ##
tapply(iris[,1], iris[,5], min)
index2 = rep(1:2, length=150)
tapply(iris[,2], list(iris[,5],index2), median)

## sapply, lapply functions ##
sapply(iris, length)   # applied to each element; returns a vector when possible!
lapply(iris, length)   # applied to each element; returns a list!
sapply(iris[-5], function(x) { which(x > mean(x)) })
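
## A small sketch of the return-type difference noted above, on a toy list
## (lst is an illustrative name). vapply() is a base R relative of sapply()
## that additionally checks the type of each result.

```r
lst = list(a = 1:3, b = 4:9)
s = sapply(lst, length)               # named integer vector: a = 3, b = 6
l2 = lapply(lst, length)              # list with elements a and b
v = vapply(lst, length, integer(1))   # like sapply, with a declared result type
```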

## Comparing the computation time of each method ##
x = rnorm(50000)  # draw a random sample from the standard normal distribution
y1 = y2 = y3 = y4 = vector()
t0 = proc.time()  # start time
for (i in 1:length(x)) {
  if (x[i] <= 0) y1[i] = -1 else y1[i] = 1
}
t1 = proc.time() - t0                 # computation time for y1
y2 = ifelse(x <= 0, -1, 1)
t2 = proc.time() - t0 - t1            # computation time for y2
y3[x <= 0] = -1; y3[x > 0] = 1
t3 = proc.time() - t0 - t1 - t2       # computation time for y3
y4 = sapply(x, function(x) {if (x <= 0) -1 else 1})
t4 = proc.time() - t0 - t1 - t2 - t3  # computation time for y4
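
## An alternative way to time a step is system.time(), which avoids the
## running subtraction above. A sketch with illustrative names (t_loop,
## t_vec); the braces let = be used for assignment inside the timed
## expression.

```r
x = rnorm(50000)
t_loop = system.time({
  y1 = numeric(length(x))            # preallocate the result
  for (i in seq_along(x)) y1[i] = if (x[i] <= 0) -1 else 1
})
t_vec = system.time({ y2 = ifelse(x <= 0, -1, 1) })
t_loop["elapsed"]; t_vec["elapsed"]  # elapsed seconds for each approach
identical(y1, y2)                    # both approaches give the same vector
```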

## Which is better? ##    # code is judged on execution efficiency and development cost (readability, comments, layout, debugging)!
aa=read.table("babies.txt",header=TRUE)
bb=na.exclude(aa$smoke);cc=vector()
for(i in 1:length(bb)){if(bb[i]==1) cc[i]="是" else cc[i]="否"}

## Read the babies data file; declare the smoke and new_var variables ##
babies = read.table("babies.txt", header=TRUE)
smoke = na.exclude(babies$smoke)
new_var = vector()

## Use a loop to recode the smoke variable and store the result in new_var ##
for (i in 1:length(smoke)) {
  if (smoke[i] == 1) {
    new_var[i] = "是"
  } else {
    new_var[i] = "否"
  }
}
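
## The same recoding can be written without a loop. This sketch uses a
## made-up vector (smoke_demo) instead of babies.txt, and "yes"/"no" stand
## in for the labels used above.

```r
smoke_demo = c(1, 0, 1, 0)                       # hypothetical smoke codes
recoded = ifelse(smoke_demo == 1, "yes", "no")   # vectorized recode
recoded                                          # "yes" "no" "yes" "no"
```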
================================================


########################################################
##          R software course series - Introduction to R (2)
##          Reference answers to in-class exercises
##          2016/11/25
########################################################


#############################
##  Data input and output  ##
#############################

## Read the external data file "babies.txt"
babies = read.table("babies.txt", header=TRUE)  # the arguments depend on the contents of the data file

## Export babies and iris to separate csv files without row names
write.csv(babies, file="babies_csv.csv", row.names=FALSE)
write.csv(iris, file="iris_csv.csv", row.names=FALSE)

## Save babies and iris to the same RData file
save(babies, iris, file="datasets.RData")


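############################
##  Program flow control  ##
############################

## Print a multiplication table (1 to 9) with loops
for (i in 1:9) {
  for (j in 1:9) {
    cat(j, "*", i, "=", i*j, sep="")
    cat("\t")  # align with a tab
  }
  cat("\n")  # newline
}  # try rewriting this with a while or repeat loop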

## Using the matrix() function; my own version, written 2016/11/27!
func = function(a)
{
  y = matrix(, a, a)  # a-by-a matrix filled with NA
  for (i in 1:a) {
    for (j in 1:a) {
      y[j, i] = i*j
    }
  }
  y
}
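
## The same table can be built in one call with the base R function
## outer(); a sketch, with tab as an illustrative name.

```r
tab = outer(1:9, 1:9)   # tab[j, i] equals j * i
tab[3, 4]               # 12
dim(tab)                # 9 9
```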


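## Generate a (random) integer from 1 to 100 and play a number-guessing game.
## After each guess, hint whether the correct answer is larger or smaller!
## Hint: infinite loop, scan()
ans = sample(100, 1)  # draw one value at random from 1 to 100
repeat {
  cat("請輸入一個介於1~100的整數:\n")
  guess = scan(n=1, quiet=TRUE)  # prompt the user for input
  if (guess == ans) {
    cat("答對囉!答案就是", ans, "!\n", sep="")
    break  # important! otherwise the guessing never ends
  } else {
    tip = ifelse(ans > guess, "大", "小")
    cat("正確答案比", guess, "還要", tip, "喔!再試一次吧!\n", sep="")
  }
}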


##############################
##  User-defined functions  ##
##############################

## Define a function that takes the three side lengths of a triangle and returns the triangle type (equilateral, isosceles, ...)
triangle = function(x)
{
  sx = sort(x)
  if (length(sx) != 3 || (sx[1] + sx[2]) <= sx[3]) {
    return("不是三角形")
  } else if (sx[1] == sx[3]) {
    return("正三角形")
  } else if (sx[1] == sx[2] || sx[2] == sx[3]) {
    return("等腰三角形")
  } else {
    return("其他三角形")
  }
}  # try adding a check for right triangles
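
## One possible sketch of the right-triangle check suggested above. The
## function name and the tolerance tol are assumptions of this sketch, not
## part of the reference answer; the tolerance guards against floating-point
## rounding in the Pythagorean test.

```r
is_right_triangle = function(x, tol = 1e-8) {
  sx = sort(x)   # after sorting, sx[3] is the longest side
  length(sx) == 3 && abs(sx[1]^2 + sx[2]^2 - sx[3]^2) < tol
}
is_right_triangle(c(3, 4, 5))   # TRUE
is_right_triangle(c(2, 3, 4))   # FALSE
```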

## Define a function that computes factorials
## Hint: use recursion; 0! = 1
f = function(x)
{
  if (x <= 1) {
    return(1)
  } else {
    return(x * f(x - 1))
  }
}

## Define a binary operator that computes the number of combinations
## Note: the number of combinations of n choose r = n! / r! / (n-r)!
"%c%" = function(n, r)
{
  f(n) / f(r) / f(n - r)
}
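
## A quick cross-check of the formula against base R's choose(). The names
## fact and %comb% are illustrative copies, so this sketch runs on its own.

```r
fact = function(x) if (x <= 1) 1 else x * fact(x - 1)      # recursive factorial
"%comb%" = function(n, r) fact(n) / fact(r) / fact(n - r)  # n! / r! / (n-r)!
5 %comb% 2      # 10
choose(5, 2)    # 10, the built-in equivalent
```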


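##############################
##  Programming techniques  ##
##############################

## Count the missing values in each variable of the babies data set
func = function(x)
{
  sum(is.na(x))  # in arithmetic on logical values, TRUE = 1 and FALSE = 0
}
apply(babies, MARGIN=2, FUN=func)

## Tabulate the frequencies of the parity variable, grouped by smoke
tapply(babies$parity, INDEX=babies$smoke, FUN=table)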
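
## The missing-value count can also be obtained with colSums(); a sketch on
## a made-up data frame (d) rather than the babies data.

```r
d = data.frame(a = c(1, NA, 3), b = c(NA, NA, 6))
n1 = apply(d, 2, function(x) sum(is.na(x)))   # a: 1, b: 2
n2 = colSums(is.na(d))                        # same counts, no helper function
```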