Statistics from a Large Sample (Java)
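# Problem
We sample integers in the range 0 to 255 and store the result in the array count: count[k] is the number of times the integer k appears in the sample.
Compute the following statistics:
- minimum: the smallest element in the sample.
- maximum: the largest element in the sample.
- mean: the average of the sample, computed as the sum of all elements divided by the number of elements.
- median:
  - If the sample has an odd number of elements, the median is the middle element once the sample is sorted.
  - If the sample has an even number of elements, the median is the average of the two middle elements once the sample is sorted.
- mode: the number that appears most often in the sample. The mode is guaranteed to be unique.
Return the statistics of the sample as an array of floating-point numbers [minimum, maximum, mean, median, mode]. Answers within 10^-5 of the true answer are accepted.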
示例 1:
输入:count = [0,1,3,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
输出:[1.00000,3.00000,2.37500,2.50000,3.00000]
解释:用count表示的样本为[1,2,2,2,3,3,3,3,3]。
最小值和最大值分别为1和3。
均值是(1+2+2+2+3+3+3+3) / 8 = 19 / 8 = 2.375。
因为样本的大小是偶数,所以中位数是中间两个元素2和3的平均值,也就是2.5。
众数为3,因为它在样本中出现的次数最多。
示例 2:
输入:count = [0,4,3,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
输出:[1.00000,4.00000,2.18182,2.00000,1.00000]
解释:用count表示的样本为[1,1,1,1,2,2,3,3,3,4,4]。
最小值为1,最大值为4。
平均数是(1+1+1+1+2+2+2+3+3+4+4)/ 11 = 24 / 11 = 2.18181818…(为了显示,输出显示了整数2.18182)。
因为样本的大小是奇数,所以中值是中间元素2。
众数为1,因为它在样本中出现的次数最多。
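For small inputs like the two examples, the expected output can be cross-checked by expanding count into the explicit sample and computing each statistic directly. The sketch below is only a checking aid, not the intended solution: it is far too slow and memory-hungry for the real constraints, and the class name `BruteForceStats` and method `stats` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BruteForceStats {
    // Expands count into the explicit sorted sample. Only viable for tiny inputs.
    static double[] stats(int[] count) {
        List<Integer> sample = new ArrayList<>();
        int mode = 0;
        for (int k = 0; k < count.length; k++) {
            for (int c = 0; c < count[k]; c++) {
                sample.add(k); // k ascends, so the list is already sorted
            }
            if (count[k] > count[mode]) {
                mode = k;
            }
        }
        int n = sample.size();
        long sum = 0;
        for (int v : sample) {
            sum += v;
        }
        int upper = sample.get(n / 2);
        int lower = (n % 2 == 1) ? upper : sample.get(n / 2 - 1);
        double median = (lower + upper) / 2.0;
        int min = sample.get(0);
        int max = sample.get(n - 1);
        return new double[]{min, max, (double) sum / n, median, mode};
    }

    public static void main(String[] args) {
        int[] count = new int[256];
        count[1] = 1;
        count[2] = 3;
        count[3] = 4;
        // Example 1: prints [1.0, 3.0, 2.375, 2.5, 3.0]
        System.out.println(Arrays.toString(stats(count)));
    }
}
```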
Constraints:
- count.length == 256
- 0 <= count[i] <= 10^9
- 1 <= sum(count) <= 10^9
- The mode of the sample represented by count is unique
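# Approach
- Minimum, maximum, mode: a single pass over count.
- Mean: accumulate the weighted sum and the element count, taking care to avoid overflow.
- Median: binary search over the prefix sums of count.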
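The overflow warning for the mean is easy to see in isolation. With count[i] up to 10^9 and i up to 255, the product can reach 2.55 × 10^11, far beyond the range of a 32-bit int. The snippet below is a small illustration under that assumption; `OverflowDemo` and its two methods are hypothetical names, not part of the solution.

```java
class OverflowDemo {
    // The multiplication happens in int arithmetic and wraps around
    // before the result is widened to long.
    static long productAsInt(int c, int k) {
        return c * k;
    }

    // Widening one operand first makes the multiplication happen in long arithmetic.
    static long productAsLong(int c, int k) {
        return (long) c * k;
    }

    public static void main(String[] args) {
        int c = 1_000_000_000; // largest allowed count[i]
        int k = 255;           // largest sampled value
        System.out.println(productAsInt(c, k));  // a wrapped, incorrect value
        System.out.println(productAsLong(c, k)); // prints 255000000000
    }
}
```

Multiplying by 1.0 first, as a solution might do when accumulating into a double, has the same effect as the cast: the product is computed in a wider type.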
# Solution
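```java
class Solution {
    public double[] sampleStats(int[] count) {
        // Defensive check; the problem guarantees count.length == 256.
        if (count == null || count.length != 256) {
            return new double[]{0, 0, 0, 0, 0};
        }
        int i, j, cnt = 0, mid, target = 0;
        double[] res = new double[5]; // [minimum, maximum, mean, median, mode]
        double sum = 0;
        res[0] = 256; // sentinel: minimum not found yet
        res[1] = -1;  // sentinel: maximum not found yet
        // leftSum[i] = count[0] + ... + count[i]
        long[] leftSum = new long[count.length];
        for (i = 0; i < 256; i++) {
            if (res[0] == 256 && count[i] != 0) {
                res[0] = i;
            }
            if (count[i] != 0) {
                res[1] = Math.max(res[1], i);
            }
            cnt += count[i]; // sum(count) <= 10^9, so an int is enough
            // Multiply by 1.0 first: count[i] * i in int arithmetic
            // would overflow before being added to sum.
            sum += count[i] * 1.0 * i;
            leftSum[i] = count[i];
            if (i > 0) {
                leftSum[i] += leftSum[i - 1];
            }
            // Track the index with the highest count (the mode).
            if (count[i] > count[((int) res[4])]) {
                res[4] = i;
            }
        }
        res[2] = sum / cnt;
        // 1-based rank of the middle element (the lower middle one when cnt is even).
        target = (cnt + 1) / 2;
        // Binary search for the smallest i with leftSum[i] >= target.
        i = 0;
        j = 255;
        while (i < j) {
            mid = (i + j) / 2;
            if (leftSum[mid] >= target) {
                j = mid;
            } else {
                i = mid + 1;
            }
        }
        if ((cnt & 1) == 0 && leftSum[i] == target) {
            // Even size, and the lower middle element is the last copy of i:
            // the upper middle element is the next value with a non-zero count.
            for (j = i + 1; j < 256; j++) {
                if (count[j] != 0) {
                    break;
                }
            }
            res[3] = (i + j) * 1.0 / 2;
        } else {
            // Odd size, or both middle elements are equal to i.
            res[3] = i;
        }
        return res;
    }
}
```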
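Since count has only 256 buckets, the median can also be found without a prefix-sum array or binary search, by walking the buckets until a running total reaches the wanted rank. The sketch below is an alternative to the approach above, not a replacement for it; `MedianByScan`, `valueAtRank` and `median` are names invented for this example.

```java
class MedianByScan {
    // Returns the value at 1-based rank r in the sorted sample described by count.
    static int valueAtRank(int[] count, long r) {
        long seen = 0;
        for (int k = 0; k < count.length; k++) {
            seen += count[k];
            if (seen >= r) {
                return k;
            }
        }
        throw new IllegalArgumentException("rank exceeds sample size");
    }

    static double median(int[] count) {
        long n = 0;
        for (int c : count) {
            n += c;
        }
        // For odd n both ranks coincide; for even n they are the two middle ranks.
        int lower = valueAtRank(count, (n + 1) / 2);
        int upper = valueAtRank(count, n / 2 + 1);
        return (lower + upper) / 2.0;
    }

    public static void main(String[] args) {
        int[] count = new int[256];
        count[1] = 1;
        count[2] = 3;
        count[3] = 4;
        System.out.println(median(count)); // Example 1: prints 2.5
    }
}
```

Looking up the two middle ranks separately removes the special case for an even-sized sample whose lower middle element is the last copy of its value.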
# Summary
- Break the problem into its separate cases, then implement each case on its own.