Arrays.copyOf and System.arraycopy: A Source Code Walkthrough

Background

In day-to-day project work, we usually copy arrays with either

java.util.Arrays.copyOf(int[], int)

or

java.lang.System.arraycopy(Object src, int srcPos, Object dest, int destPos, int length)

These are tools the JDK hands us ready-made, so how could we not take a proper look at them? Hee hee~

Analysis

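First, let's dig through the source of Arrays.copyOf. Its overloads fall into two groups: primitive types and reference types. We start with the primitive overloads, taking the int overload as a representative example.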

public static int[] copyOf(int[] original, int newLength) {
    int[] copy = new int[newLength];
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}

Source 1

The API documentation for this method reads:

Copies the specified array, truncating or padding with zeros (if necessary) so the copy has the specified length. For all indices that are valid in both the original array and the copy, the two arrays will contain identical values. For any indices that are valid in the copy but not the original, the copy will contain 0. Such indices will exist if and only if the specified length is greater than that of the original array.

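In plain terms (well, I'm only the messenger here~):

Copy the specified array, truncating it (if you only want the leading part of the source array) or padding it with 0 (if the requested length, that is, the newLength argument, is greater than the length of the source array). For every index that is valid in both the source array and the copy, the two arrays hold the same value; for every index that is valid in the copy but not in the source array, the copy holds 0 (this happens only when newLength is greater than the length of the source array).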
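The truncate-or-pad rule is easy to check directly. The following is a minimal sketch; the class name CopyOfLengthDemo and the sample values are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; only Arrays.copyOf comes from the JDK.
class CopyOfLengthDemo {
    public static void main(String[] args) {
        int[] original = {1, 2, 3, 4, 5};

        // newLength < original.length: the copy is truncated.
        int[] shorter = Arrays.copyOf(original, 3);
        System.out.println(Arrays.toString(shorter)); // [1, 2, 3]

        // newLength > original.length: the tail is padded with 0.
        int[] longer = Arrays.copyOf(original, 7);
        System.out.println(Arrays.toString(longer)); // [1, 2, 3, 4, 5, 0, 0]

        // The result is always a new array, even when the length is unchanged.
        int[] same = Arrays.copyOf(original, original.length);
        System.out.println(same != original); // true
    }
}
```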

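Easy enough, right? But notice that Source 1 essentially just delegates to System.arraycopy. Many places in the JDK end up calling this method when it comes to the actual work (for example, when add in java.util.ArrayList has to grow its backing array it calls Arrays.copyOf, which in turn ends up in System.arraycopy). Let's look at this method's API. It is a native method. My understanding of native methods is this: Java declares the interface (a .h header is generated with javah, or with javac -h on newer JDKs), C/C++ provides the implementation (including that header and jni.h), the C/C++ source is compiled into a shared library (a DLL on Windows), and finally System.loadLibrary loads that library so the native method can be called from Java. In other words, native methods exist to extend what a Java program can do. (System.arraycopy itself ships inside the JVM, so nothing has to be loaded by hand to use it.) We will come back to this API later; first let's finish the reference-type Arrays.copyOf.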
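To make the ArrayList remark concrete, here is a heavily simplified sketch of a growable list that expands through Arrays.copyOf. It is not the real ArrayList code: the class GrowableIntList, its initial capacity, and its growth formula are all assumptions chosen for illustration.

```java
import java.util.Arrays;

// Hypothetical, simplified growable int list; NOT the actual ArrayList source.
class GrowableIntList {
    private int[] elements = new int[2]; // tiny initial capacity, for the demo only
    private int size;

    public void add(int value) {
        if (size == elements.length) {
            // Growth formula is an assumption of this sketch (roughly 1.5x).
            // copyOf allocates the larger array and copies the old contents,
            // which internally goes through System.arraycopy.
            int newCapacity = elements.length + (elements.length >> 1) + 1;
            elements = Arrays.copyOf(elements, newCapacity);
        }
        elements[size++] = value;
    }

    public int size() {
        return size;
    }

    public int[] toArray() {
        // Trimming to the logical size is another copyOf call.
        return Arrays.copyOf(elements, size);
    }

    public static void main(String[] args) {
        GrowableIntList list = new GrowableIntList();
        for (int i = 1; i <= 5; i++) {
            list.add(i * 10);
        }
        System.out.println(Arrays.toString(list.toArray())); // [10, 20, 30, 40, 50]
    }
}
```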

Here is the source of the reference-type Arrays.copyOf:

@SuppressWarnings("unchecked")
public static <T> T[] copyOf(T[] original, int newLength) {
    return (T[]) copyOf(original, newLength, original.getClass());
}

Source 2

Stepping into the call on the third line (the return statement):

public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
    @SuppressWarnings("unchecked")
    T[] copy = ((Object)newType == (Object)Object[].class)
        ? (T[]) new Object[newLength]
        : (T[]) Array.newInstance(newType.getComponentType(), newLength);
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}

Source 3

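Source 3 first creates an empty destination array whose runtime type is given by newType. If newType is Object[].class it simply uses new Object[newLength]; otherwise it goes through the reflective Array.newInstance (which itself ends up in a native method) and uses getComponentType to obtain the array's element type (declared native in JDK 8).

In the end it, too, calls System.arraycopy.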
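The difference between the two reference-type overloads shows up in the runtime type of the result. A small sketch follows; the class name CopyOfTypeDemo and the sample data are invented for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class showing which runtime array type copyOf produces.
class CopyOfTypeDemo {
    public static void main(String[] args) {
        Integer[] ints = {1, 2, 3};

        // Two-argument form: the copy has the same runtime type as the original
        // (original.getClass()), even when viewed through a Number[] variable.
        Number[] viewedAsNumbers = ints;
        Number[] copy = Arrays.copyOf(viewedAsNumbers, 4);
        System.out.println(copy.getClass().getSimpleName()); // Integer[]
        System.out.println(Arrays.toString(copy));           // [1, 2, 3, null]

        // Three-argument form: the caller chooses the runtime type of the copy.
        Object[] objects = {"a", "b"};
        String[] strings = Arrays.copyOf(objects, objects.length, String[].class);
        System.out.println(strings.getClass().getSimpleName()); // String[]
        System.out.println(Arrays.toString(strings));           // [a, b]
    }
}
```

Note that reference-type copies are padded with null rather than 0.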

Now let's look at the System.arraycopy API. The official documentation reads:

Copies an array from the specified source array, beginning at the specified position, to the specified position of the destination array. A subsequence of array components are copied from the source array referenced by src to the destination array referenced by dest. The number of components copied is equal to the length argument. The components at positions srcPos through srcPos+length-1 in the source array are copied into positions destPos through destPos+length-1, respectively, of the destination array.
If the src and dest arguments refer to the same array object, then the copying is performed as if the components at positions srcPos through srcPos+length-1 were first copied to a temporary array with length components and then the contents of the temporary array were copied into positions destPos through destPos+length-1 of the destination array.

If dest is null, then a NullPointerException is thrown.

If src is null, then a NullPointerException is thrown and the destination array is not modified.

Otherwise, if any of the following is true, an ArrayStoreException is thrown and the destination is not modified:

  1. The src argument refers to an object that is not an array.
  2. The dest argument refers to an object that is not an array.
  3. The src argument and dest argument refer to arrays whose component types are different primitive types.
  4. The src argument refers to an array with a primitive component type and the dest argument refers to an array with a reference component type.
  5. The src argument refers to an array with a reference component type and the dest argument refers to an array with a primitive component type.

Otherwise, if any of the following is true, an IndexOutOfBoundsException is thrown and the destination is not modified:

  1. The srcPos argument is negative.
  2. The destPos argument is negative.
  3. The length argument is negative.
  4. srcPos+length is greater than src.length, the length of the source array.
  5. destPos+length is greater than dest.length, the length of the destination array.

Otherwise, if any actual component of the source array from position srcPos through srcPos+length-1 cannot be converted to the component type of the destination array by assignment conversion, an ArrayStoreException is thrown. In this case, let k be the smallest nonnegative integer less than length such that src[srcPos+k] cannot be converted to the component type of the destination array; when the exception is thrown, source array components from positions srcPos through srcPos+k-1 will already have been copied to destination array positions destPos through destPos+k-1 and no other positions of the destination array will have been modified. (Because of the restrictions already itemized, this paragraph effectively applies only to the situation where both arrays have component types that are reference types.)

In other words:

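Elements src[srcPos, ..., srcPos+length-1] are copied to dest[destPos, ..., destPos+length-1]. If src and dest are the same array object, the copy behaves as if src[srcPos, ..., srcPos+length-1] were first copied to a temporary array and the contents of that temporary array were then copied into dest[destPos, ..., destPos+length-1].
If dest is null, a NullPointerException is thrown.
If src is null, a NullPointerException is thrown and the destination array is not modified.
If any of the following holds, an ArrayStoreException is thrown and the destination array is not modified:
1. src is not an array.
2. dest is not an array.
3. src and dest are both arrays, but their component types are different primitive types.
4. src and dest are both arrays, but one has a primitive component type and the other has a reference component type.
If any of the following holds, an IndexOutOfBoundsException is thrown and the destination array is not modified:
1. srcPos < 0
2. destPos < 0
3. length < 0
4. srcPos+length > src.length
5. destPos+length > dest.length
Beyond all that (yes, copying an array really does come with this many restrictions; success is not that easy~), if an element of the source range cannot be converted to the component type of the destination array, an ArrayStoreException is thrown. Note that the elements copied before the exception are not rolled back; they have already taken effect.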

So System.arraycopy can throw three kinds of exception:

  1. NullPointerException
  2. IndexOutOfBoundsException
  3. ArrayStoreException

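Only the third one can leave the destination array partially modified, and only when it is caused by an element that fails the type conversion; in every other case the destination array is left untouched.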
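The rules above can be exercised with a few deliberately bad calls. This is a sketch only: the class ArraycopyFailureDemo and its helper attempt are hypothetical names, and the helper merely reports which exception class was raised.

```java
import java.util.Arrays;

// Hypothetical demo class; `attempt` is an illustrative helper, not a JDK API.
class ArraycopyFailureDemo {
    // Runs one copy and reports which exception type (if any) it raised.
    static String attempt(Object src, int srcPos, Object dest, int destPos, int length) {
        try {
            System.arraycopy(src, srcPos, dest, destPos, length);
            return "ok";
        } catch (NullPointerException | IndexOutOfBoundsException | ArrayStoreException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        int[] ints = {1, 2, 3};
        int[] target = new int[3];

        System.out.println(attempt(null, 0, target, 0, 3));      // NullPointerException
        System.out.println(attempt(ints, 0, new long[3], 0, 3)); // ArrayStoreException
        System.out.println(attempt(ints, 2, target, 0, 3));      // an IndexOutOfBoundsException subclass
        System.out.println(Arrays.toString(target));             // [0, 0, 0] (never modified)

        // Element-level failure: the Integer at index 2 cannot be stored in a String[].
        Object[] mixed = {"a", "b", 42, "d"};
        String[] strings = new String[4];
        System.out.println(attempt(mixed, 0, strings, 0, 4));    // ArrayStoreException
        // The elements before the offending one were already copied and stay copied.
        System.out.println(Arrays.toString(strings));            // [a, b, null, null]
    }
}
```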

Some readers may find the rule about src and dest being the same array object unclear. One example makes it obvious.

arr = [obj0, obj1, obj2, obj3]

Now call System.arraycopy(arr, 0, arr, 1, 3) on it.

Without a temporary array in the picture, the code would look like this (bear in mind that src and dest are the same object, namely arr):

for (int i = 0; i < length; i++) {
    dest[destPos + i] = src[srcPos + i];
}

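But then at i=0, arr[1] becomes obj0; at i=1, because arr[1] is already obj0, arr[2] becomes obj0 as well; and at i=2, because arr[2] is already obj0, arr[3] also becomes obj0. In other words, if the API worked without the temporary array, arr would end up as [obj0, obj0, obj0, obj0]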

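It is so WEIRD!!!

To avoid this, the API's designers specified that the copy must behave as if it went through a temporary array, so the actual result of the call is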

[obj0, obj0, obj1, obj2]

which is what one would reasonably expect.
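Both outcomes can be reproduced side by side. In this sketch the class OverlapDemo and the method naiveCopy are hypothetical; naiveCopy is the front-to-back loop described above, not anything from the JDK.

```java
import java.util.Arrays;

// Hypothetical demo class comparing a naive loop with System.arraycopy
// when source and destination ranges overlap in the same array.
class OverlapDemo {
    // Naive front-to-back loop: when destPos > srcPos and the ranges overlap,
    // it overwrites source elements before they have been read.
    static void naiveCopy(Object[] src, int srcPos, Object[] dest, int destPos, int length) {
        for (int i = 0; i < length; i++) {
            dest[destPos + i] = src[srcPos + i];
        }
    }

    public static void main(String[] args) {
        String[] a = {"obj0", "obj1", "obj2", "obj3"};
        naiveCopy(a, 0, a, 1, 3);
        System.out.println(Arrays.toString(a)); // [obj0, obj0, obj0, obj0]

        String[] b = {"obj0", "obj1", "obj2", "obj3"};
        System.arraycopy(b, 0, b, 1, 3);
        System.out.println(Arrays.toString(b)); // [obj0, obj0, obj1, obj2]
    }
}
```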

Since this is copying, one question is unavoidable: for the array elements, is it a deep copy or a shallow copy?

I'll let an example do the talking.

package com.yfs;

public class Test {

    public static void main(String[] args) {
        Person[] arr1 = { new Person("A"), new Person("B"), new Person("C") };
        Person[] arr2 = new Person[3];
        System.arraycopy(arr1, 0, arr2, 0, arr1.length);
        arr2[2].setName("CCCCC");

        for (int i = 0; i < arr2.length; i++) {
            System.out.print(arr2[i].getName() + " "); // A B CCCCC
        }

        System.out.println();

        System.out.println(arr1 == arr2); // false

        for (int i = 0; i < arr1.length; i++) {
            System.out.print(arr1[i].getName() + " "); // A B CCCCC
        }
    }

    public static class Person {
        private String name;

        public Person(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }
}

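This is enough to show that:

1. For arrays of reference types the copy is shallow: only the references are copied, so both arrays point at the same objects.
2. For arrays of primitive types the values themselves are copied, so the result is effectively a deep copy.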

P.S. Personally, I feel deep copying is fairly expensive, so avoid it whenever you can~ That is probably why so many APIs default to shallow copies.
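When independent elements really are needed, the deep copy has to be written by hand. The sketch below assumes the element type offers a copy constructor; the classes DeepCopyDemo and Person and the method deepCopy are hypothetical and are not the Person class from the example above.

```java
import java.util.Arrays;

// Hypothetical demo class: a hand-written deep copy next to a primitive copy.
class DeepCopyDemo {
    // Illustrative element type with a copy constructor (an assumption of this sketch).
    static final class Person {
        private String name;

        Person(String name) { this.name = name; }

        Person(Person other) { this.name = other.name; }

        String getName() { return name; }

        void setName(String name) { this.name = name; }
    }

    // Deep copy: a new array AND a new object for every non-null element.
    static Person[] deepCopy(Person[] source) {
        Person[] copy = new Person[source.length];
        for (int i = 0; i < source.length; i++) {
            copy[i] = (source[i] == null) ? null : new Person(source[i]);
        }
        return copy;
    }

    public static void main(String[] args) {
        Person[] people = { new Person("A"), new Person("B") };
        Person[] deep = deepCopy(people);
        deep[1].setName("BBB");
        System.out.println(people[1].getName()); // B (the original is untouched)

        int[] numbers = {1, 2, 3};
        int[] numbersCopy = Arrays.copyOf(numbers, numbers.length);
        numbersCopy[0] = 99;
        System.out.println(numbers[0]); // 1 (primitive values are copied, not shared)
    }
}
```

The extra allocation per element is exactly the cost the P.S. is worried about.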

Finally, the array-copying approaches we all know are

1. A for loop, copying by hand
2. System.arraycopy()
3. Arrays.copyOf()
4. clone()

That makes four. So how do they compare in efficiency?

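System.arraycopy() sits closest to the metal: it copies memory in bulk and skips a lot of per-element addressing work, so it is the fastest. Arrays.copyOf(), as the analysis above shows, is just a thin wrapper around System.arraycopy(), so it comes in second.

A hand-written for loop spends its time on per-element addressing and assignment and ranks third. Last in this ranking is clone() (which is a shallow copy by default, not a deep copy; a deep copy requires cloning recursively).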

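For small arrays the first three differ very little; for larger arrays the gap becomes obvious~ Actual numbers depend on the JVM and its JIT, so benchmark before relying on this ranking.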
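For reference, the four approaches look like this in code. This is a sketch that only checks that they produce equal results; the class FourWaysToCopy and its method names are hypothetical, and it deliberately does no timing, since a trustworthy comparison would need a proper benchmark harness such as JMH.

```java
import java.util.Arrays;

// Hypothetical demo class: four ways to copy an int[] that give the same result.
class FourWaysToCopy {
    // 1. Manual for loop.
    static int[] withLoop(int[] src) {
        int[] copy = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            copy[i] = src[i];
        }
        return copy;
    }

    // 2. System.arraycopy into a pre-allocated array.
    static int[] withArraycopy(int[] src) {
        int[] copy = new int[src.length];
        System.arraycopy(src, 0, copy, 0, src.length);
        return copy;
    }

    // 3. Arrays.copyOf, which allocates and copies in one call.
    static int[] withCopyOf(int[] src) {
        return Arrays.copyOf(src, src.length);
    }

    // 4. clone() on the array itself.
    static int[] withClone(int[] src) {
        return src.clone();
    }

    public static void main(String[] args) {
        int[] src = {3, 1, 4, 1, 5, 9};
        System.out.println(Arrays.equals(src, withLoop(src)));      // true
        System.out.println(Arrays.equals(src, withArraycopy(src))); // true
        System.out.println(Arrays.equals(src, withCopyOf(src)));    // true
        System.out.println(Arrays.equals(src, withClone(src)));     // true
    }
}
```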