private class Itr implements Iterator<E> {
    int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;

    public boolean hasNext() {
        return cursor != size;
    }

    @SuppressWarnings("unchecked")
    public E next() {
        checkForComodification();
        int i = cursor;
        if (i >= size)
            throw new NoSuchElementException();
        Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length)
            throw new ConcurrentModificationException();
        cursor = i + 1;
        return (E) elementData[lastRet = i];
    }

    public void remove() {
        if (lastRet < 0)
            throw new IllegalStateException();
        checkForComodification();

        try {
            ArrayList.this.remove(lastRet);
            cursor = lastRet;
            lastRet = -1;
            expectedModCount = modCount;
        } catch (IndexOutOfBoundsException ex) {
            throw new ConcurrentModificationException();
        }
    }

    @Override
    @SuppressWarnings("unchecked")
    public void forEachRemaining(Consumer<? super E> consumer) {
        Objects.requireNonNull(consumer);
        final int size = ArrayList.this.size;
        int i = cursor;
        if (i >= size) {
            return;
        }
        final Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length) {
            throw new ConcurrentModificationException();
        }
        while (i != size && modCount == expectedModCount) {
            consumer.accept((E) elementData[i++]);
        }
        // update once at end of iteration to reduce heap write traffic
        cursor = i;
        lastRet = i - 1;
        checkForComodification();
    }

    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}
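The `Itr` source above is exactly why removing elements from an `ArrayList` inside a for-each loop usually blows up: `List.remove` bumps `modCount`, the iterator's `expectedModCount` goes stale, and the next `next()` call fails `checkForComodification()`. A minimal demonstration (one caveat worth knowing: removing the second-to-last element can slip through silently, because `hasNext()` returns false before the check ever runs):

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class FailFastDemo {
    // Returns true if removing from the list mid-iteration triggers the
    // fail-fast check in ArrayList.Itr.checkForComodification().
    static boolean removalFailsFast() {
        List<String> list = new ArrayList<>(List.of("a", "b", "c", "d"));
        try {
            for (String s : list) {
                if ("b".equals(s)) {
                    list.remove(s); // structural modification bumps modCount
                }
            }
            return false; // loop finished without tripping the check
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(removalFailsFast()); // prints true
    }
}
```

Calling `it.remove()` on the iterator itself is safe, because (as shown above) it resynchronizes `expectedModCount = modCount` after the removal.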
for (String item : maxArrayList) {
    if (testSet.contains(item)) {
        // TODO
    }
}
The test results are as follows.
Below are the times taken by the two deduplication approaches for a 10,000,000-element list against a 20,000-element list; the set-based approach is clearly much faster.
1. Dedup time, list against list:
14:52:02,408 INFO [RunTest:37] start test list:17-11-07 14:52:02
14:59:49,828 INFO [ListUtils:66] after deWight list size: 9980000
14:59:49,829 INFO [RunTest:39] end test list:17-11-07 14:59:49
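The original `deWight` code isn't shown, but approach 1 presumably looks something like the sketch below (method and variable names are mine, not the author's). Every `smallList.contains` call is a linear scan of up to 20,000 elements, so the total work is O(n·m), which fits the nearly eight-minute runtime logged above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ListDedup {
    // Approach 1 (sketch): remove from maxList every element also present
    // in smallList, using List.contains -- an O(m) scan per lookup.
    static List<String> deWightByList(List<String> maxList, List<String> smallList) {
        Iterator<String> it = maxList.iterator();
        while (it.hasNext()) {
            if (smallList.contains(it.next())) {
                it.remove(); // iterator's own remove keeps modCount in sync
            }
        }
        return maxList;
    }

    public static void main(String[] args) {
        List<String> max = new ArrayList<>(List.of("a", "b", "c", "d"));
        System.out.println(deWightByList(max, List.of("b", "d"))); // prints [a, c]
    }
}
```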
2. Dedup time, list against set:
14:59:53,226 INFO [RunTest:44] start test set:17-11-07 14:59:53
15:01:30,079 INFO [ListUtils:80] after deWight list size: 9980000
15:01:30,079 INFO [RunTest:46] end test set:17-11-07 15:01:30
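Approach 2 presumably differs only in the lookup structure (again a sketch with invented names): copying the small list into a `HashSet` makes each membership test O(1) on average, which is enough to cut the runtime to the minute and a half logged above, even though each `it.remove()` still shifts the tail of the backing array.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class SetDedup {
    // Approach 2 (sketch): same removal loop as approach 1, but lookups go
    // through a HashSet, dropping total lookup cost from O(n*m) to ~O(n).
    static List<String> deWightBySet(List<String> maxList, List<String> smallList) {
        Set<String> lookup = new HashSet<>(smallList);
        Iterator<String> it = maxList.iterator();
        while (it.hasNext()) {
            if (lookup.contains(it.next())) {
                it.remove();
            }
        }
        return maxList;
    }

    public static void main(String[] args) {
        List<String> max = new ArrayList<>(List.of("a", "b", "c", "d"));
        System.out.println(deWightBySet(max, List.of("b", "d"))); // prints [a, c]
    }
}
```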
Below are the times for deduplicating a 25,000,000-element list against a 20,000-element list; the set-based approach is faster still (the larger the data, the more pronounced the gap).
I also tested with a set of 15,000,000 elements, and approaches 3 and 4 remained very fast.
1. Dedup time, list against list:
15:17:47,114 INFO [RunTest:35] start test list, start time: 17-11-07 15:17:47
15:49:04,876 INFO [ListUtils:57] after deWight list size: 24980000
15:49:04,877 INFO [RunTest:39] end test list, end time: 17-11-07 15:49:04
2. Dedup time, list against set:
15:49:17,842 INFO [RunTest:44] start test set, start time: 17-11-07 15:49:17
15:53:22,716 INFO [ListUtils:71] after deWight list size: 24980000
15:53:22,718 INFO [RunTest:48] end test set, end time: 17-11-07 15:53:22
3. Dedup time, list against set, assembling a new list instead of deleting from the original (since deleting from a list is inefficient):
17:18:44,583 INFO [RunTest:57] start test set, start time: 17-11-22 17:18:44
17:18:54,628 INFO [ListUtils:92] after deWight list size: 23500000
17:18:54,628 INFO [RunTest:61] end test set, end time: 17-11-22 17:18:49
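Approach 3 avoids `ArrayList.remove` entirely: every removal shifts the tail of the backing array (O(n) each), so copying the survivors into a fresh list in a single O(n) pass is far cheaper, which matches the ~10-second result above. A sketch under the same naming assumptions:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NewListDedup {
    // Approach 3 (sketch): never mutate the big list; append the elements
    // that survive the set filter to a new, pre-sized list instead.
    static List<String> deWightNewList(List<String> maxList, List<String> smallList) {
        Set<String> exclude = new HashSet<>(smallList);
        List<String> result = new ArrayList<>(maxList.size());
        for (String s : maxList) {
            if (!exclude.contains(s)) {
                result.add(s); // no element shifting, just an append
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> max = List.of("a", "b", "c", "d");
        System.out.println(deWightNewList(max, List.of("b", "d"))); // prints [a, c]
    }
}
```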
4. Dedup with a set during traversal (one reason this is my top recommendation: it is astonishingly fast):
15:17:45,762 INFO [RunTest:24] start test foreach list directly, start time: 17-11-07 15:17:45
15:17:47,114 INFO [RunTest:32] end test foreach list directly, end time: 17-11-07 15:17:47
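Approach 4 is essentially the loop shown earlier with `maxArrayList` and `testSet`: there is no separate dedup pass at all. The set lookup filters duplicates inside the traversal the program was going to make anyway, so the only extra cost is one O(1) `contains` per element, which explains the sub-two-second result above. A sketch (the consumer callback is illustrative, not from the original code):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

public class InlineDedup {
    // Approach 4 (sketch): filter against the set while traversing, so
    // deduplication piggybacks on the pass the caller already makes.
    // Returns how many non-duplicate elements were handled.
    static int forEachUnique(List<String> maxList, Set<String> exclude,
                             Consumer<String> action) {
        int handled = 0;
        for (String item : maxList) {
            if (!exclude.contains(item)) {
                action.accept(item); // process only the non-duplicates
                handled++;
            }
        }
        return handled;
    }

    public static void main(String[] args) {
        Set<String> exclude = new HashSet<>(Set.of("b", "d"));
        int n = forEachUnique(List.of("a", "b", "c", "d"), exclude, System.out::println);
        System.out.println(n); // prints 2
    }
}
```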