144. Discovering mismatches between two files

The solution to this problem is to compare the content of the two files byte by byte until the first mismatch is found or EOF is reached.

Let's consider the following four text files:

Only the first two files (file1.txt and file2.txt) are identical. Any other comparison should reveal the presence of at least one mismatch.

One solution is to use MappedByteBuffer, which maps regions of a file directly into memory. This approach is fast, especially for large files, and easy to implement. We open two FileChannels (one for each file), map both files in chunks of at most 5 MB, and compare them byte by byte until we find the first mismatch or reach EOF. If the files don't have the same length in bytes, then they cannot be identical, so we return immediately:

// Imports needed by the snippet:
// java.io.IOException, java.nio.MappedByteBuffer,
// java.nio.channels.FileChannel, java.nio.channels.FileChannel.MapMode,
// java.nio.file.Path, java.nio.file.StandardOpenOption, java.util.EnumSet

private static final int MAP_SIZE = 5242880; // 5 MB in bytes

public static boolean haveMismatches(Path p1, Path p2)
    throws IOException {

  try (FileChannel channel1 = (FileChannel.open(p1,
      EnumSet.of(StandardOpenOption.READ)))) {

    try (FileChannel channel2 = (FileChannel.open(p2,
        EnumSet.of(StandardOpenOption.READ)))) {

      long length1 = channel1.size();
      long length2 = channel2.size();

      // files of different sizes cannot be identical
      if (length1 != length2) {
        return true;
      }

      // long, not int, so that files larger than 2 GB don't overflow
      long position = 0;
      while (position < length1) {
        long remaining = length1 - position;
        int bytestomap = (int) Math.min(MAP_SIZE, remaining);

        // map the same region of both files
        MappedByteBuffer mbBuffer1 = channel1.map(
          MapMode.READ_ONLY, position, bytestomap);
        MappedByteBuffer mbBuffer2 = channel2.map(
          MapMode.READ_ONLY, position, bytestomap);

        // compare the mapped regions byte by byte
        while (mbBuffer1.hasRemaining()) {
          if (mbBuffer1.get() != mbBuffer2.get()) {
            return true;
          }
        }

        position += bytestomap;
      }
    }
  }

  return false;
}
Support for non-volatile MappedByteBuffers (JEP 352) was prepared during the JDK 13 time frame and delivered in JDK 14.
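The same chunked mapping can also report where the first difference occurs instead of only whether one exists. The following is a minimal sketch, not part of the original solution: the class name MappedMismatch and the method firstMismatch() are hypothetical, and the inner loop is replaced with ByteBuffer.mismatch(), which is available starting with JDK 11. When one file is a prefix of the other, the sketch returns the length of the shorter file:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedMismatch {

  private static final int MAP_SIZE = 5242880; // 5 MB in bytes

  // Hypothetical helper: returns the offset of the first differing byte,
  // or -1 if both files have identical content
  public static long firstMismatch(Path p1, Path p2) throws IOException {

    try (FileChannel channel1 = FileChannel.open(p1, StandardOpenOption.READ);
         FileChannel channel2 = FileChannel.open(p2, StandardOpenOption.READ)) {

      long length1 = channel1.size();
      long length2 = channel2.size();
      long common = Math.min(length1, length2);

      long position = 0;
      while (position < common) {
        int bytesToMap = (int) Math.min(MAP_SIZE, common - position);

        MappedByteBuffer buffer1 = channel1.map(
          MapMode.READ_ONLY, position, bytesToMap);
        MappedByteBuffer buffer2 = channel2.map(
          MapMode.READ_ONLY, position, bytesToMap);

        // index relative to the start of the chunk, or -1 if the chunks match
        int relative = buffer1.mismatch(buffer2);
        if (relative != -1) {
          return position + relative;
        }

        position += bytesToMap;
      }

      // all common bytes match; the files differ only if their sizes differ
      return (length1 == length2) ? -1 : common;
    }
  }
}
```

Calling MappedMismatch.firstMismatch(p1, p2) != -1 gives the same answer as haveMismatches(p1, p2), at the cost of scanning the common prefix when the file sizes differ.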

Starting with JDK 12, the Files class has been enriched with a new method dedicated to pinpointing mismatches between two files. This method has the following signature:

public static long mismatch(Path path, Path path2) throws IOException

This method finds and returns the position of the first mismatched byte in the content of two files. If there is no mismatch, then it returns -1:

long mismatches12 = Files.mismatch(file1, file2); // -1
long mismatches13 = Files.mismatch(file1, file3); // 51
long mismatches14 = Files.mismatch(file1, file4); // 60
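To try Files.mismatch() without the four sample files, the following self-contained sketch writes short strings to temporary files and compares them. The class name FilesMismatchDemo, the helper mismatchOf(), and the sample strings are assumptions made for illustration only, and the code requires JDK 12 or later:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FilesMismatchDemo {

  // Hypothetical helper: writes each string to a temporary file and
  // returns the result of Files.mismatch() for the two files
  static long mismatchOf(String s1, String s2) throws IOException {

    Path f1 = Files.createTempFile("mismatch-a", ".txt");
    Path f2 = Files.createTempFile("mismatch-b", ".txt");

    try {
      Files.write(f1, s1.getBytes(StandardCharsets.UTF_8));
      Files.write(f2, s2.getBytes(StandardCharsets.UTF_8));

      return Files.mismatch(f1, f2);
    } finally {
      // clean up the temporary files
      Files.deleteIfExists(f1);
      Files.deleteIfExists(f2);
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(mismatchOf("same content", "same content")); // -1
    System.out.println(mismatchOf("same content", "same Content")); // 5
    System.out.println(mismatchOf("same", "same content"));         // 4
  }
}
```

The last call shows a case worth remembering: when one file is a prefix of the other, Files.mismatch() returns the size of the smaller file, since that is the first position at which the contents differ.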