DiffUtil DOES NOT OPERATE ON DATA

November 5th 2016

As it is extremely important to stress I have made it the title of this blog post. That is that DiffUtil does not operate on data. There is a lot of confusion when I am talking with developers as to what exactly DiffUtil does. A lot of developers believe it will help you transform data sets but truly it is only for calling appropriate update methods on a recyclerview. That said, this post will also cover in depth DiffUtill all the way from an easy how to use tutorial (which we will start with) to getting slightly into how Eugene Myer's Difference Algorithm works. I highly recommend reading it when you get a chance.

So to start out with (as most tutorials do) let use quickly discuss getting started with DiffUtil.

			
compile 'com.android.support:recyclerview-v7:25.0.0'
			
		

That obvoiusly will help you add the dependency as it is a part of the recyclerview package from the support library. Which is also a good indicator of why DiffUtil does not operate on data.

The next thing we will do is just find somewhere in your code to place the code below. Maybe in the onCreate of your activity or something. Also while putting this in, read through the javadocs as those are a part of this blog post as well, hence them being so in depth.

			
/*We will use the actual values 
from Eugene Myers' algorithm. 
These values were used in his 
paper I linked.So first we create 
list A which is what we have now.
Then we create list B which is what
 we want list A to transform into.
You might view list A as what is 
in some adapter and list B as a new
 network request we want to display.*/
final char[] A = new char[] {'a', 'b', 'c', 'a', 'b', 'b', 'a'};
final char[] B = new char[] {'c', 'b', 'a', 'b', 'a', 'c'};
/*Next we call upon DiffUtil, 
which will run Eugene's algorithm 
as well as some extra logic
to calculate moves 
(Thanks to Yigit Boyar) and calculate
snakes (talked about later)
To return what deletes inserts and 
moves we need to do.*/
DiffUtil.calculateDiff(new DiffUtil.Callback() {
    /**
     * The calculation needs the old 
     * list length to setup 1 (x) side 
     * of the matrix (talked about later)
     * @return
     */
    @Override
    public int getOldListSize() {
        return A.length;
    }
    /**
     * The calculation also needs the
     * new list that we need to turn 
     * list A into, this will also help
     * setup the other side of the matrix
     * (y) (talked about later)
     * @return
    */
    @Override
    public int getNewListSize() {
        return B.length;
    }
    /**
     * Then we have a concept of if the
     * items are the same and contents, 
     * this method checks if the item 
     * disregarding its contents is the 
     * same item so lets say you have a 
     * view holder. The view holder has 
     * something that needs to update
     * maybe say the score of a game, or 
     * someones logged in or logged out 
     * status. This is where you would return
     * that the cell is the same person but 
     * the content has changed, in the fact 
     * that the user may have logged in and 
     * it is time to change an
     * item on the viewholder to green.
     * @param oldItemPosition The old items 
     *   position in List A. Use this value and 
     *   get an item out of the old list to check
     *   against with newItemPosition
     * @param newItemPosition The new items 
     *   position in List B. This is how you 
     *   get the item out of List B and check 
     *   against List A.
     * @return The return value true, indicates 
     *   that indeed what is talked about above 
     *   is true and we need to not necassrily 
     *   move or delete this item but rather update it.
    */
    @Override
    public boolean areItemsTheSame(int oldItemPosition, int newItemPosition) {
        return A[oldItemPosition] == B[newItemPosition];
    }

    /**
     * Now this is similar to above
     * but is the completion of the
     * though process as in it returns
     * whether the contents of the
     * items are the same. So for 
     * instance if areItemsSame returned
     * false, then this method might be
     * ignored, but if it returned true
     * then this method might be called
     * and then we would return false. 
     * Stating that the item needs to be
     * updated. Or we could return true, 
     * stating that the item does not need
     * to be updated and is exactly the 
     * same. (Creating a snake, also 
     * talked about later).
     * @param oldItemPosition Same as 
     *    aboves description of this parameter.
     * @param newItemPosition Same as 
     *    aboves description of this parameter.
     * @return Spoken about in the 
     *    description of this method above.
    */
    @Override
    public boolean areContentsTheSame(int oldItemPosition, int newItemPosition) {
        return A[oldItemPosition] == B[newItemPosition];
    }
});
			
		

Now this is where it gets interesting, you normally see on blog posts about how you simply call the below method.

			
.dispatchUpdatesTo(mAdapter)
			
		

And that is indeed the correct way you should do it, but it requires that List B, has already been put into the backing data store of the adapter. So for instance you would need to clear out the items of mAdapters backing store, and then addAll(listB) so that way the items in the adapters backing store reflect listB even though the recycler view has not been updated yet.

That probably sounds a little confusing, but again that is where this blog post is to emphasize that DiffUtil does not operate on data. Which we will get into deeper now by implementing the other option we are given in the dispatchUpdatesTo method which is to hand in a ListUpdateCallback.

So with that, go ahead and add the following code to your example as well as read through the javadocs to understand what is going on.

			
/*Call the dispatch updates
method and hand in our own 
custom ListUpdateCallback.
When we hand in an adapter, 
the method simply creates a 
wrapper around the adapter, 
and that wrapper is a 
ListUpdateCallback.*/
.dispatchUpdatesTo(new ListUpdateCallback() {
    /**
     * So this is the onInserted method
     * as you can see and this is where
     * we can plainly see why DiffUtil is
     * not meant to operate on data. 
     * Eugene's algorithm, when doing an 
     * insert command will actually tell 
     * you what you are inserting.
     * But if you notice the two parameters
     * for this method, they are simply 
     * integers. There is no way for us to 
     * know what is being inserted. 
     * So we have to trust that the item 
     * at the position we are handed is 
     * already updated correctly in the 
     * recycler views backing data store. 
     * And if you look at how the other 
     * dispatchUpdatesToMethod which takes 
     * in a adapter works, it simply calls
     * notifyItemInserted. We are given the 
     * count here to keep calling it at that
     * position correctly. We will look into 
     * more later in the
     * blog post at the actual algorithm what
     * this method would be if we were to 
     * operate on data.
     * @param position The position to call 
     *   notifyItemInserted on.
     * @param count How many times to call 
     *   notifyItemInserted at the position.
     *   (or alternatively, call 
     *   notifyItemRangeInserted)
    */
    @Override
    public void onInserted(int position, int count) {
        Log.i("DiffUtil", "INSERT : pos " + position + " : count " + count);
    }

    /**
     * Now this method is simply the remove 
     * part of the algorithm and is as simple
     * as it sounds you call notifyItemRemoved
     * or notifyItemRangeRemoved at the 
     * specified position. This method, however,
     * were we trying to make this actually 
     * follow Eugene Myers' algorithm, would be
     * all we need. It reflects well what the 
     * algorithm calls for and this will make 
     * more sense as I explain at the bottom 
     * of the blog post how the algorithm works.
     * @param position
     * @param count
    */
    @Override
    public void onRemoved(int position, int count) {
        Log.i("DiffUtil", "REMOVE : pos " + position + " : count " + count);
    }

    /**
     * Now this is the part of the algorithm 
     * that Yigit Boyar wrote from scratch. 
     * Eugene Myers' algorithm doesn't handle
     * moves, Yigit altered the implementation
     * Eugene provided to support moving an 
     * item which as you can image this method
     * was hard to implement.
     * This methods purpose as you can kind of
     * tell is to call notifyItemMoved on the recyclerview
     * @param fromPosition in list a the 
     *   position that needs to move.
     * @param toPosition where in list a 
     *   the position needs to move to.
    */
    @Override
    public void onMoved(int fromPosition, int toPosition) {
        Log.i("DiffUtil", "MOVED : from " + fromPosition + " : to " + toPosition);
    }

    /**
     * Then this method as it sounds is for
     * listening when something has changed. 
     * Which as you can imagin is for when 
     * something maybe is the same item but 
     * content changed. This would call the 
     * notifyItemChanged or 
     * notifyItemRangeChanged methods.
     * @param position The position that 
     *    changed in list a.
     * @param count the count of items 
     *    that changed for if you need to 
     *    do a rangeItemChanged call.
     * @param payload The object payload 
     *    to deal with the change, which is 
     *    a custom object returned from 
     *    DiffUtil.Callback.getChangePayload
    */
    @Override
    public void onChanged(int position, int count, Object payload) {
        Log.i("DiffUtil", "CHANGED : pos " + position + " : count " + count + " : payload " + payload);
    }
});
			
		

So now if we were to look at the output of the code we get this from the logs.

			
DiffUtil: INSERT : pos 7 : count 1
DiffUtil: MOVED : from 5 : to 3
DiffUtil: REMOVE : pos 0 : count 2
			
		

Which if you follow the trace of it we can see that the algorithm is indeed turning List A into List B. So now that you have seen how to use DiffUtil in depth, lets explain a little about how the actual algorithm works. Again this is super high level and if you read the paper it is worth it but hopefully this can be a good start to help you understand what happens under the covers.

The way the algorithm bases how to do comparisons is by creating an edit graph, that edit graph looks like a matrix really. So you can imagine an N by M graph where N is list A and list B is M. So with this NxM graph we need to create an edit script. That edit script is how we can get from 0,0 on the graph to NxM. So as we know with graphs, we have to choose certain paths, so in this graph you can do diagnol moves, down, or right (if 0, 0 conceptually is at the top left). Now each of those 3 types of moves say something about how a data set needs to changed based off of rules that Eugene Myers' made and you can read more in depth in the paper.

Another idea to know when working with DiffUtil is this edit script or path to get from 0,0 to NxM is a set of operations to alter List A. So the algorithm is always altering List A, not List B. So that again is also why for the inserts we cant use DiffUtil on data. Since the insert method doesnt tell us what to insert we cant, however Eugene Myers' algorithm does tell use what to insert. So if you look on page 3 of Eugene's paper I linked you can see the Edit script. Where you see 3IB that means at position 3 insert the value B into List A. That value part is where the algorithm tells us what got inserted.

So this is in depth with DiffUtil, I wont go into the algorithm's deep parts and explain the inner workings of how the edit script gets generated, that is for the reader to do more research on, (and its worth it) but for now I would get familiar with the ListUpdateCallback to understand a little more under the covers what goes on with this algorithm.

One more thing worth noting is that the algorithm is greate for mobile devices and not display huge amounts of data as mobile devices normally do not. The algorithms worst case running though is O(nlog(n)+D^2) which you can read about in the paper more, this is not going to be most peoples use cases though hence the class reference only showing O(n*D^2).

Anyways thank you for reading hope you enjoy.

2 Timothy 1:7