
The scale and unscale functions in R


1. The scale function

R's base package ships with scale, a built-in function for standardizing data. Its documentation reads as follows:

Usage

scale(x, center = TRUE, scale = TRUE)

 

Arguments

x: a numeric matrix (like object).

center: either a logical value or a numeric vector of length equal to the number of columns of x.

scale: either a logical value or a numeric vector of length equal to the number of columns of x.

 

Details

The value of center determines how column centering is performed. If center is a numeric vector with length equal to the number of columns of x, then each column of x has the corresponding value from center subtracted from it. If center is TRUE then centering is done by subtracting the column means (omitting NAs) of x from their corresponding columns, and if center is FALSE, no centering is done.

The value of scale determines how column scaling is performed (after centering). If scale is a numeric vector with length equal to the number of columns of x, then each column of x is divided by the corresponding value from scale. If scale is TRUE then scaling is done by dividing the (centered) columns of x by their standard deviations if center is TRUE, and the root mean square otherwise. If scale is FALSE, no scaling is done.

The root-mean-square for a (possibly centered) column is defined as sqrt(sum(x^2)/(n-1)), where x is a vector of the non-missing values and n is the number of non-missing values. In the case center = TRUE, this is the same as the standard deviation, but in general it is not. (To scale by the standard deviations without centering, use scale(x, center = FALSE, scale = apply(x, 2, sd, na.rm = TRUE)).)
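To make the difference between the standard deviation and the root mean square concrete, here is a minimal sketch comparing the three cases described above. The matrix `m` and the helper `rms` are illustrative names chosen for this example; they are not part of the R documentation.

```r
# Illustrative data: a 3 x 2 numeric matrix (values chosen arbitrarily).
m <- matrix(c(1, 2, 3, 10, 20, 60), ncol = 2)

# Root mean square as defined above: sqrt(sum(x^2) / (n - 1)),
# computed over the non-missing values only.
rms <- function(x) {
  x <- x[!is.na(x)]
  sqrt(sum(x^2) / (length(x) - 1))
}

# Default: center by column means, then divide by column standard deviations.
z <- scale(m)
sd_used <- attr(z, "scaled:scale")

# No centering: each column is divided by its root mean square instead.
r <- scale(m, center = FALSE)
rms_used <- attr(r, "scaled:scale")

# No centering, but still dividing by the standard deviations.
s <- scale(m, center = FALSE, scale = apply(m, 2, sd, na.rm = TRUE))

all.equal(sd_used, apply(m, 2, sd))    # scale factors equal the column sds
all.equal(rms_used, apply(m, 2, rms))  # scale factors equal the column RMS
```

Because the columns of `m` do not have mean zero, `sd_used` and `rms_used` differ, which is why the `center = FALSE` case needs the explicit `scale = apply(...)` form when standard deviations are wanted.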

 

Value

For scale.default, the centered, scaled matrix. The numeric centering and scalings used (if any) are returned as attributes "scaled:center" and "scaled:scale".
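The attributes mentioned here can be read back with attr(). The sketch below uses a small illustrative data frame `d` (an assumption made for this example) to show where the values live and what happens when scaling is turned off.

```r
# Illustrative data frame; scale() converts it to a matrix internally.
d <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))
sc <- scale(d)

# The values used for centering and scaling are stored on the result.
centers <- attr(sc, "scaled:center")  # column means
scales  <- attr(sc, "scaled:scale")   # column standard deviations

# When an argument is FALSE, the matching attribute is absent (NULL).
only_centered <- scale(d, scale = FALSE)
is.null(attr(only_centered, "scaled:scale"))
```

Keeping these two vectors around is what makes it possible to apply the same transformation to new data later, or to undo it.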

 

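By default, scale performs z-score standardization: it first subtracts the mean, then divides by the standard deviation.

z-score standardization (zero-mean normalization)

Also called standard-deviation standardization, this method standardizes the data based on the mean and standard deviation of the original data.

The transformed data have mean 0 and standard deviation 1. The shape of the distribution is unchanged, so the result follows a standard normal distribution only if the original data were normally distributed. The transformation is:

$$x^{*} = \frac{x - \mu}{\sigma}$$

where μ is the mean of all sample data and σ is the standard deviation of all sample data.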
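The z-score transformation can be checked by hand against scale(). This is a minimal sketch on an arbitrary illustrative vector `x`; note that sd() in R is the sample standard deviation (divisor n - 1), which matches the definition in the scale documentation.

```r
# Illustrative sample (arbitrary values).
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# z-score by hand: subtract the mean, divide by the sample standard deviation.
z_manual <- (x - mean(x)) / sd(x)

# scale() returns a one-column matrix; as.vector() drops dims and attributes.
z_scale <- as.vector(scale(x))

all.equal(z_manual, z_scale)
mean(z_manual)  # 0, up to floating-point error
sd(z_manual)    # 1, up to floating-point error
```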

 

2. The unscale function

The unscale function in the DMwR package restores the original data from an object returned by scale.

Usage

unscale(vals, norm.data, col.ids)

 

Arguments

vals: A numeric matrix with the values to un-scale

norm.data: A numeric and scaled matrix. This should be an object to which the function scale() was applied.

col.ids: The columns of the vals matrix that are to be un-scaled (defaults to all of them).

 

Value

An object with the same dimension as the parameter vals.
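If DMwR is not available, the same idea can be sketched in base R. The helper `unscale_base` below is hypothetical: it is not DMwR's implementation, it only assumes that the attributes set by scale() are present on `norm_data`, and it does not support the col.ids argument.

```r
# Hypothetical helper (not DMwR's code): reverse scale() using the
# "scaled:center" and "scaled:scale" attributes stored on `norm_data`.
unscale_base <- function(vals, norm_data) {
  center <- attr(norm_data, "scaled:center")
  scl    <- attr(norm_data, "scaled:scale")
  out <- as.matrix(vals)
  # Undo the steps in reverse order: multiply by the scale, then add the center.
  if (!is.null(scl))    out <- sweep(out, 2, scl, "*")
  if (!is.null(center)) out <- sweep(out, 2, center, "+")
  # Drop the bookkeeping attributes so the result is a plain matrix.
  attr(out, "scaled:center") <- NULL
  attr(out, "scaled:scale")  <- NULL
  out
}

# Round trip on a small illustrative matrix.
original <- matrix(c(1, 2, 3, 2, 4, 6), ncol = 2,
                   dimnames = list(NULL, c("x", "y")))
scaled   <- scale(original)
restored <- unscale_base(scaled, scaled)
all.equal(restored, original)
```

Passing a different matrix as `vals` (for example, model predictions made on the scaled data) maps those values back to the original units, provided its columns line up with those of `norm_data`.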

 

3. Usage example

> library(DMwR)  # provides unscale()
> df <- data.frame(x=c(1,2,3), y=c(2,4,6), z=c(3,6,9))
> df
  x y z
1 1 2 3
2 2 4 6
3 3 6 9

> # Standardize each column; the means and standard deviations used are stored as attributes
> scaledData <- scale(df)
> scaledData
      x  y  z
[1,] -1 -1 -1
[2,]  0  0  0
[3,]  1  1  1
attr(,"scaled:center")
x y z
2 4 6
attr(,"scaled:scale")
x y z
1 2 3

> # Restore the original values from the scaled matrix
> unscale(scaledData, scaledData)
     x y z
[1,] 1 2 3
[2,] 2 4 6
[3,] 3 6 9

> # Apply the same centering and scaling to new data
> ndf <- data.frame(x=c(1,2), y=c(2,4), z=c(3,6))
> ndf
  x y z
1 1 2 3
2 2 4 6

> scale(ndf, center=attr(scaledData, "scaled:center"), scale=attr(scaledData, "scaled:scale"))
      x  y  z
[1,] -1 -1 -1
[2,]  0  0  0
attr(,"scaled:center")
x y z
2 4 6
attr(,"scaled:scale")
x y z
1 2 3

