Knowing that you have CPU-related issues in your app is one thing — doing something about it is the next challenge. In some respects, tuning an Android application is a “one-off” job, tied to the particulars of the application and what it is trying to accomplish. That being said, this chapter will outline some general-purpose ways of boosting performance that may counter issues that you are running into.
Understanding this chapter requires that you have read the core chapters and understand how Android apps are set up and operate. Reading the introductory chapter to this trail is also a good idea.
One class of CPU-related problems comes from purely sluggish code. These are the sorts of things you will see in Traceview, for example: methods or branches of code that seem to take an inordinately long time. These are also some of the most difficult problems to offer general solutions for, as it often comes down to what the application is trying to accomplish. However, the following sections provide suggestions for consuming fewer CPU instructions while getting the same work done.
These are presented in no particular order.
Most of your algorithm fixes will be standard Java optimizations, no different from those that Java projects have used for well over a decade. This section outlines a few of them. For more, consider reading Effective Java by Joshua Bloch or Java Performance Tuning by Jack Shirazi.
Few objects in the java.* namespaces are intrinsically thread-safe, outside of java.util.concurrent. Typically, you need to perform your own synchronization if multiple threads will be accessing non-thread-safe objects. However, sometimes, Java classes have synchronization that you neither expect nor need, and that unneeded synchronization adds overhead.
The classic example here is StringBuffer and StringBuilder. StringBuffer was part of Java from early on, and, for whatever reason, was written to be thread-safe: two threads that append to the buffer will not cause any problems. However, most of the time, you are only using the StringBuffer from one thread, meaning all that synchronization overhead is a waste. Later on, Java added StringBuilder, with the same basic set of methods as StringBuffer, but without the synchronization.
Similarly, in your own code, only synchronize where it is really needed. Do not toss the synchronized keyword around randomly, or use concurrent collections that will only be used by one thread.
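As a minimal sketch of the single-threaded case, the following helper builds a string with StringBuilder. The JoinDemo class and its join() method are hypothetical names invented for this illustration. Swapping in StringBuffer would produce the same output, while paying for locking that a single thread never needs.

```java
// Hypothetical example: JoinDemo and join() are illustrative names,
// not part of any Android or Java API.
final class JoinDemo {
    // The builder never escapes this method, so only one thread can touch it
    // and the unsynchronized StringBuilder is sufficient.
    static String join(String[] words, String separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(words[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints "alpha, beta, gamma"
        System.out.println(join(new String[] {"alpha", "beta", "gamma"}, ", "));
    }
}
```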
The first generation of Android devices lacked a floating-point coprocessor on the ARM CPU package. As a result, floating-point math speed was atrocious. That is why the Google Maps add-on for Android uses GeoPoint, with latitude and longitude in integer microdegrees, rather than the standard Android Location class, which uses Java double variables holding decimal degrees.
While later Android devices do have floating-point coprocessor support, that does not mean that floating-point math is now as fast as integer math. If you find that your code is spending lots of time on floating-point calculations, consider whether a change in units would allow you to replace the floating-point calculations with integer equivalents. For example, microdegrees for latitude and longitude provide adequate granularity for most maps, yet allow Google Maps to do all of its calculations in integers.
Similarly, consider whether the full decimal accuracy of floating-point values is really needed. While it may be physically possible to perform distance calculations in meters with accuracy to a few decimal places, for example, in many cases the user will not need that degree of accuracy. If so, perhaps changing to fixed-point (integer) math can boost your performance.
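To make the change-of-units idea concrete, here is a small sketch of working in integer microdegrees. MicroDegrees and its methods are hypothetical names for this illustration, not Android classes. The conversion from double happens once at the boundary, and the arithmetic afterwards stays in int.

```java
// Hypothetical helper: MicroDegrees is an illustrative name, not an Android class.
final class MicroDegrees {
    private static final double E6 = 1000000.0;

    // Convert decimal degrees to integer microdegrees once, at the boundary.
    static int fromDegrees(double degrees) {
        return (int) Math.round(degrees * E6);
    }

    // Convert back only when a decimal value is needed, e.g., for display.
    static double toDegrees(int microDegrees) {
        return microDegrees / E6;
    }

    // Integer-only midpoint of two coordinates; no floating-point math involved.
    // Latitudes and longitudes in microdegrees stay well inside the int range.
    static int midpoint(int a, int b) {
        return a + (b - a) / 2;
    }

    public static void main(String[] args) {
        int lat1 = fromDegrees(40.0);
        int lat2 = fromDegrees(41.5);
        System.out.println(midpoint(lat1, lat2)); // prints 40750000
    }
}
```

Whether this is actually faster on a given device is something to confirm with Traceview or logging before committing to it.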
Years upon years of work have gone into the implementation of various algorithms that underlie Java methods, like searching for substrings inside of strings.
Somewhat less work has gone into the implementation of the Apache Harmony versions of those methods, simply because the project is younger, and it is a modified version of the Harmony implementation that you will find in Android. While the core Android team has made many improvements to the original Harmony implementation, those improvements may be for optimizations that do not fit your needs (e.g., optimizing to reduce memory consumption at the expense of CPU time).
But beyond that, there are dozens of string-matching algorithms, some of which may be better for you depending on the string being searched and the string being searched for. Hence, you may wish to consider applying your own searching algorithm rather than relying on the built-in one, to boost performance. And, this same concept may hold for other algorithms as well (e.g., sorting).
Of course, this will also increase the complexity of your application, with long-term impacts in terms of maintenance cost. Hence, do not assume the built-in algorithms are the worst, either — optimize those algorithms that Traceview or logging suggest are where you are spending too much time.
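As one illustration of a hand-rolled alternative, here is a sketch of the Boyer-Moore-Horspool substring search. This is a well-known algorithm chosen for the example, not what Android uses internally, and the Horspool class name is hypothetical. It can skip several characters per mismatch for longer search strings, but whether it beats the built-in indexOf() for your data is something you would need to measure.

```java
import java.util.Arrays;

// Hypothetical sketch of Boyer-Moore-Horspool; not Android's implementation.
final class Horspool {
    // Returns the index of the first occurrence of needle in haystack, or -1.
    static int indexOf(String haystack, String needle) {
        int n = haystack.length();
        int m = needle.length();
        if (m == 0) {
            return 0;
        }
        if (m > n) {
            return -1;
        }

        // Shift table for characters below 256; by default, skip the whole needle.
        int[] shift = new int[256];
        Arrays.fill(shift, m);
        for (int i = 0; i < m - 1; i++) {
            char c = needle.charAt(i);
            if (c < 256) {
                shift[c] = m - 1 - i;
            }
        }

        int pos = 0;
        while (pos <= n - m) {
            // Compare the current window against the needle, right to left.
            int j = m - 1;
            while (j >= 0 && haystack.charAt(pos + j) == needle.charAt(j)) {
                j--;
            }
            if (j < 0) {
                return pos;
            }
            // Skip ahead based on the last character of the window.
            // Characters outside the table get the always-safe shift of 1.
            char last = haystack.charAt(pos + m - 1);
            pos += (last < 256) ? shift[last] : 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("hello world", "world")); // prints 6
    }
}
```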
An easy “win” is to add android:hardwareAccelerated="true" to your <application> element in the manifest. This toggles on hardware acceleration for 2D graphics, including much of the stock widget framework. For maximum backwards compatibility, this hardware acceleration is off by default for apps targeting API levels below 14, but adding the aforementioned attribute will enable it for all activities in your application.
Note that this is only available starting with Android 3.0. It is safe to have the attribute in the manifest for older Android devices, as they simply will ignore your request.
You also should test your application thoroughly after enabling hardware acceleration, to make sure there are no unexpected issues. For ordinary widget-based applications, you should encounter no problems. Games or other applications that do their own drawing might have issues. If you find that some of your code runs into problems, you can override hardware acceleration on a per-activity basis by putting the android:hardwareAccelerated attribute on <activity> elements in the manifest.
Calling a method on an object in your own process is fairly inexpensive. The overhead of the method invocation is fairly minuscule, and so the time involved is simply however long it takes for that method to do its work.
Invoking behaviors in another process, via inter-process communication (IPC), is considerably more expensive. Your request has to be converted into a byte array (e.g., via the Parcelable interface), made available to the other process, converted back into a regular request, then executed. This adds substantial CPU overhead.
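There are three basic flavors of IPC in Android:
- Calling methods on a remote service that you bound to via bindService()
- Performing operations on a content provider that runs in another process
- Asking Android to perform other operations that, under the covers, involve IPC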
Using a remote service is fairly obvious when you do it — it is difficult to mistake copying the AIDL into your project and such. The proxy object generated from the AIDL converts all your method calls on the interface into IPC operations, and this is relatively expensive.
If you are exposing a service via AIDL, design your API to be coarse-grained. Do not require the client to make 1,000 method invocations to accomplish something that can be done in 1 via slightly more complex arguments and return values.
If you are consuming a remote service, try not to get into situations where you have to make lots of calls in a tight loop, or per row of a scrolled AdapterView, or anything else where the overhead may become troublesome.
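The difference between a fine-grained and a coarse-grained API can be sketched in plain Java. ScoreService below is a hypothetical stand-in for an AIDL-defined interface, and its roundTrips counter merely models IPC round trips; no real IPC or Android API is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: ScoreService stands in for an AIDL-defined remote
// interface. The counter models IPC round trips; no real IPC occurs here.
final class ScoreService {
    int roundTrips = 0;

    // Fine-grained: one simulated round trip per player.
    int getScore(String player) {
        roundTrips++;
        return player.length(); // placeholder "score"
    }

    // Coarse-grained: one simulated round trip for the whole batch.
    List<Integer> getScores(List<String> players) {
        roundTrips++;
        List<Integer> scores = new ArrayList<>(players.size());
        for (String player : players) {
            scores.add(player.length());
        }
        return scores;
    }

    public static void main(String[] args) {
        List<String> players = Arrays.asList("ann", "bob", "carla");

        ScoreService fine = new ScoreService();
        for (String player : players) {
            fine.getScore(player);
        }

        ScoreService coarse = new ScoreService();
        coarse.getScores(players);

        // prints "3 vs 1"
        System.out.println(fine.roundTrips + " vs " + coarse.roundTrips);
    }
}
```

With a real remote service, each of those round trips would carry the marshalling overhead described above, so the batch form pays that cost once rather than once per item.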
For example, in the CPU-Java/AIDLOverhead sample project, you will find a pair of projects implementing the same do-nothing method in equivalent services. One uses AIDL and is bound to remotely from a separate client application; the other is a local service in the client application itself. The client then calls the do-nothing method 1 million times for each of the two services. On average, on a Samsung Galaxy Tab 10.1, 1 million calls take around 170 seconds for the remote service, while they take around 170 milliseconds for the local service. Hence, the overhead of an individual remote method invocation is small in absolute terms (~170 microseconds, versus ~170 nanoseconds for a local call), but doing lots of them in a loop, or as the user flings a ListView, might become noticeable.
Using a content provider can be somewhat less obvious of a problem. Using ContentResolver or a CursorLoader looks the same whether it is your own content provider or someone else’s. However, you know what content providers you wrote; anything else is probably running in another process.
As with remote services, try to aggregate operations with remote content providers, such as:
- Use bulkInsert() rather than lots of individual insert() calls
- Avoid calling update() or delete() in a tight loop; instead, if the content provider supports it, use a more complex “WHERE clause” to update or delete everything at once
The content provider scenario is really a subset of the broader case where you request that Android do something for you and Android winds up performing IPC as part of that.
Sometimes, this is going to be obvious. If you are sending commands to a third-party service via startService(), by definition, this will involve IPC, since the third-party service will run in a third-party process. Try to avoid calling startService() lots of times in close succession.
However, there are plenty of cases that are less obvious:
- Calls to startActivity(), startService(), and sendBroadcast() involve IPC, as it is a separate OS process that does the real work
- Registering a BroadcastReceiver (e.g., via registerReceiver()) involves IPC
- System services, such as LocationManager, are really rich interfaces to an AIDL-defined remote service, and so most operations on these system services require IPC
Once again, your objective should be to minimize calls that involve IPC, particularly where you are making those calls frequently in close succession, such as in a loop. For example, frequently calling getLastKnownLocation() will be expensive, as that involves IPC to a system process.
The way that the Dalvik VM was implemented and operates is subtly different from a traditional Java VM. Therefore, some optimizations matter more on Android than they do in regular desktop or server Java.
The Android developer documentation has a roster of such optimizations. Some of the highlights include:
- For simple cases, such as ViewHolder objects for optimizing an Adapter, consider skipping the accessor methods and just using the fields directly.
- Prefer the built-in library methods where they exist, as some have hand-optimized implementations; indexOf() on String and arraycopy() on System are two cited examples. These will run much faster than anything you might create yourself in Java.
Another class of CPU-related problem arises when your code may be efficient, but it is running on the main application thread, causing your UI to react sluggishly. You might have tuned your decryption algorithm as best as is mathematically possible, but it may be that decrypting data on the main application thread simply takes too much time. Or, perhaps StrictMode complained about some disk or network I/O that you are performing on the main application thread.
The following sections recap some commonly-seen patterns for moving work off the main application thread, plus a few newer options that you may have missed.
Most developers think of having too many allocations as being solely an issue of heap space. That certainly has an impact, and depending on the nature of the allocations (e.g., bitmaps), it may be the dominant issue.
However, garbage has impacts from a CPU standpoint as well. Every object you create causes its constructor to be executed. Every object that is garbage-collected requires CPU time both to find the object in the heap and to actually clean it up (e.g., execute the finalizer, if any).
Worse still, on older versions of Android (e.g., Android 2.2 and down), the garbage collector interrupts the entire process to do its work, so the more garbage you generate, the more times you “stop the world”. Game developers have had to deal with this since Android’s inception. To maintain a 60 FPS refresh rate, you cannot afford any garbage collections on older devices, as a single GC run could easily take more than the ~16ms you have per drawing pass.
As a result of all of this, game developers have had to carefully manage their own object pools, pre-allocating a bunch of objects before game play begins, then using and recycling those objects themselves, only allowing them to become garbage after game play ends.
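A minimal sketch of such an object pool follows. ParticlePool and Particle are hypothetical names for this illustration, and the pool is deliberately simple: it is not thread-safe and assumes it is used from a single thread, such as a game loop.

```java
import java.util.ArrayDeque;

// Hypothetical sketch: ParticlePool and Particle are illustrative names.
// Not thread-safe; intended for use from a single thread.
final class ParticlePool {
    static final class Particle {
        float x, y;
    }

    private final ArrayDeque<Particle> free = new ArrayDeque<>();

    // Pre-allocate everything up front, before game play begins.
    ParticlePool(int capacity) {
        for (int i = 0; i < capacity; i++) {
            free.push(new Particle());
        }
    }

    // Hand out a pooled instance; allocate only if the pool has run dry.
    Particle obtain() {
        Particle p = free.poll();
        return (p != null) ? p : new Particle();
    }

    // Reset an instance and return it to the pool instead of letting it
    // become garbage.
    void recycle(Particle p) {
        p.x = 0f;
        p.y = 0f;
        free.push(p);
    }

    int available() {
        return free.size();
    }

    public static void main(String[] args) {
        ParticlePool pool = new ParticlePool(2);
        Particle first = pool.obtain();
        pool.recycle(first);
        Particle second = pool.obtain();
        System.out.println(first == second); // prints true: the instance was reused
    }
}
```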
Most non-game Android applications may not have to go to quite that extreme across the board. However, there are cases where excessive allocation may cause you difficulty. For example, avoiding creating too much garbage is one aspect of view recycling with AdapterView, which is covered in greater detail in the next section.
If Traceview indicates that you are spending a lot of time in garbage collection, pay attention to your loops or things that may be invoked many times in rapid succession (e.g., accessing data from a custom Cursor implementation that is tied to a CursorAdapter). These are the most likely places where your own code might be creating lots of extra objects that are not needed. Examining the heap to see everything that is being created (and eventually garbage collected) will be covered in an upcoming chapter of the book.
Perhaps the best-covered Android-specific optimization is view recycling with AdapterView.
In a nutshell, if you are extending BaseAdapter, or if you are overriding getView() in another adapter, please make use of the View parameter supplied to getView() (referred to here as convertView). If convertView is not null, it is one of the View objects you previously returned from getView(), being offered to you for recycling purposes. Using convertView saves you from inflating or manually constructing a fresh View every time the user scrolls, and both of those operations are relatively expensive.
If you have been ignoring convertView because you have more than one type of View that getView() returns, your Adapter should be overriding getViewTypeCount() and getItemViewType(). These will allow Android to maintain separate object pools for each type of row from your Adapter, so any non-null convertView passed to getView() will match the row type you are trying to create.
A somewhat more advanced optimization, caching all those findViewById() lookups, is also possible once your row recycling is in place. Often referred to as “the holder pattern”, you do the findViewById() calls when you inflate a new row, then attach the findViewById() results to the row itself via some custom “holder” object and the setTag() method on View. When you recycle the row, you can get your “holder” back via getTag() and skip having to do the findViewById() calls again.
Of course, the backbone of any strategy to move work off the main application thread is to use background threads, in one form or fashion. You will want to apply these in places where StrictMode complains about network or disk I/O, or places where Traceview or logging indicate that you are taking too much time on the main application thread during GUI processing (e.g., converting downloaded bitmap images into Bitmap objects via BitmapFactory).
Sometimes, you will manually dictate where work should be done in the background, either by forking threads yourself or by using AsyncTask. AsyncTask is a nice framework, handling all of the inter-thread communication for you and neatly packaging up the work to be done in readily understood methods. However, AsyncTask does not fit every scenario: it is mostly designed for “transactional” work that is known to take a modest amount of time (milliseconds to seconds) then end. For cases where you need unbounded background processing, such as monitoring a socket for incoming data, forking your own thread will be the better approach.
Sometimes, you will use facilities supplied by Android to move work to the background. For example, many activities are backed by a Cursor obtained from a database or content provider. Classically, you would manage the cursor (via startManagingCursor()) or otherwise arrange to refresh that Cursor in onResume(), so when your activity returns to the foreground after having been gone for a while, you would have fresh data. However, this pattern tends to lead to database I/O on the main application thread, triggering complaints from StrictMode. Android 3.0 and the Android Compatibility Library offer a Loader framework designed to try to solve the core pattern of refreshing the data, while arranging for the work to be done asynchronously.
99.44% of the time (approximately) that Android calls your code in some sort of event handler, you are being called on the main application thread. This includes manifest-registered BroadcastReceiver components: onReceive() is called on the main application thread. So any work you do in onReceive() ties up that thread (possibly impacting an activity of yours in the foreground), and if you take more than 10 seconds, Android will terminate your BroadcastReceiver with extreme prejudice.
Classically, manifest-registered BroadcastReceiver components only live as long as the onReceive() call does, meaning you can do very little work in the BroadcastReceiver itself. The typical pattern is to have it send a command to a service via startService(), where the service “does the heavy lifting”.
Android 3.0 added a goAsync() method on BroadcastReceiver that can help a bit here. While under-documented, it tells Android that you need more time to complete the broadcast work, but that you can do that work on a background thread. This does not eliminate the 10-second rule, but it does mean that the BroadcastReceiver can do some amount of I/O without tying up the main application thread and without having to send a command to a service to do it.
The CPU-Java/GoAsync sample project demonstrates goAsync() in use, as the project name might suggest.
Our activity’s layout consists of two Button widgets and an EditText widget:
<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
  android:orientation="vertical" android:layout_width="match_parent"
  android:layout_height="match_parent">
  <EditText android:id="@+id/editText1" android:layout_width="match_parent"
    android:layout_height="wrap_content">
  </EditText>
  <Button android:layout_width="match_parent" android:id="@+id/button1"
    android:layout_height="wrap_content" android:text="@string/nonasync"
    android:onClick="sendNonAsync"></Button>
  <Button android:layout_width="match_parent" android:id="@+id/button2"
    android:layout_height="wrap_content" android:text="@string/async"
    android:onClick="sendAsync"></Button>
</LinearLayout>
The activity itself simply has sendAsync() and sendNonAsync() methods, each invoking sendBroadcast() to a different BroadcastReceiver implementation:
package com.commonsware.android.tuning.goasync;

import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.view.View;

public class GoAsyncActivity extends Activity {
  @Override
  public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.main);
  }

  // Wired to button2 via android:onClick in the layout
  public void sendAsync(View v) {
    sendBroadcast(new Intent(this, AsyncReceiver.class));
  }

  // Wired to button1 via android:onClick in the layout
  public void sendNonAsync(View v) {
    sendBroadcast(new Intent(this, NonAsyncReceiver.class));
  }
}
The NonAsyncReceiver simulates doing time-consuming work in onReceive() itself:
package com.commonsware.android.tuning.goasync;

import android.content.BroadcastReceiver;
import android.content.Context;
import android.content.Intent;
import android.os.SystemClock;

public class NonAsyncReceiver extends BroadcastReceiver {
  @Override
  public void onReceive(Context arg0, Intent arg1) {
    // Simulated work, blocking the main application thread for 7 seconds
    SystemClock.sleep(7000);
  }
}
Hence, if you click the “Send Non-Async Broadcast” button, not only will the button fail to return to its normal state for seven seconds, but the EditText will not respond to user input either.
The AsyncReceiver, though, uses goAsync():
package com.commonsware.android.tuning.goasync;

import android.content.BroadcastReceiver;
import android.content.Context;
import android.content.Intent;
import android.os.SystemClock;

public class AsyncReceiver extends BroadcastReceiver {
  @Override
  public void onReceive(Context context, Intent intent) {
    // Tell Android the broadcast will be finished later, from another thread
    final BroadcastReceiver.PendingResult result=goAsync();

    (new Thread() {
      @Override
      public void run() {
        // Same simulated work, but off the main application thread
        SystemClock.sleep(7000);
        result.finish();
      }
    }).start();
  }
}
The goAsync() method returns a PendingResult, which supports a series of methods that you might ordinarily call on the BroadcastReceiver itself (e.g., abortBroadcast()) but want to call on a background thread. You need your background thread to have access to the PendingResult; in this case, that is via a final local variable. When you are done with your work, call finish() on the PendingResult.
If you click the “Send Async Broadcast” button, even though we are still sleeping for 7 seconds, we are doing so on a background thread, and so our user interface is still responsive.
The classic way to save SharedPreferences.Editor changes was via a call to commit(). This writes the preference information to an XML file on whatever thread you are on, making it another hidden source of disk I/O you might be doing on the main application thread.
If you are on API Level 9 or higher, and you are willing to blindly try saving the changes, use the apply() method on SharedPreferences.Editor, which writes to disk asynchronously.
If you need to support older versions of Android, or you really want the boolean return value from commit(), consider doing the commit() call in an AsyncTask or background thread.
And, of course, to support both of these, you will need to employ tricks like conditional class loading. You can see that used for saving SharedPreferences in the CPU-Java/PrefsPersist sample project. The activity reads in a preference, puts the current value on the screen, then updates the preference with the help of an AbstractPrefsPersistStrategy class and its persist() method:
package com.commonsware.android.tuning.prefs;

import android.app.Activity;
import android.content.SharedPreferences;
import android.os.Bundle;
import android.preference.PreferenceManager;
import android.widget.TextView;

public class PrefsPersistActivity extends Activity {
  private static final String KEY="counter";

  @Override
  public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.main);

    SharedPreferences prefs=
      PreferenceManager.getDefaultSharedPreferences(this);
    int counter=prefs.getInt(KEY, 0);

    // Show the stored value, then persist the incremented one in the background
    ((TextView)findViewById(R.id.value)).setText(String.valueOf(counter));
    AbstractPrefsPersistStrategy.persist(prefs.edit().putInt(KEY, counter+1));
  }
}
AbstractPrefsPersistStrategy is an abstract base class that will hold a strategy implementation, depending on Android version. On pre-Honeycomb builds, it uses an implementation that forks a background thread to perform the commit():
package com.commonsware.android.tuning.prefs;

import android.content.SharedPreferences;
import android.os.Build;

abstract public class AbstractPrefsPersistStrategy {
  abstract void persistAsync(SharedPreferences.Editor editor);

  // Chosen once, when this class is first loaded
  private static final AbstractPrefsPersistStrategy INSTANCE=initImpl();

  public static void persist(SharedPreferences.Editor editor) {
    INSTANCE.persistAsync(editor);
  }

  private static AbstractPrefsPersistStrategy initImpl() {
    // SDK_INT (API Level 4+) replaces parsing the deprecated SDK string
    int sdk=Build.VERSION.SDK_INT;

    if (sdk<Build.VERSION_CODES.HONEYCOMB) {
      return(new CommitAsyncStrategy());
    }

    // ApplyStrategy is only referenced, and so only loaded, on newer devices
    return(new ApplyStrategy());
  }

  static class CommitAsyncStrategy extends AbstractPrefsPersistStrategy {
    @Override
    void persistAsync(final SharedPreferences.Editor editor) {
      // commit() performs disk I/O, so run it off the calling thread
      (new Thread() {
        @Override
        public void run() {
          editor.commit();
        }
      }).start();
    }
  }
}
On Honeycomb and higher, it uses a separate strategy class that uses the apply() method (apply() itself dates back to API Level 9, so the Honeycomb cutoff in this sample is conservative):
package com.commonsware.android.tuning.prefs;

import android.content.SharedPreferences.Editor;

public class ApplyStrategy extends AbstractPrefsPersistStrategy {
  @Override
  void persistAsync(Editor editor) {
    editor.apply();
  }
}
By separating the Honeycomb-specific code out into a separate class, we can avoid loading it on older devices and encountering the dreaded VerifyError.
Whether using the built-in apply() method is worth dealing with multiple strategies, versus simply calling commit() on a background thread, is up to you.
Being efficient and doing work on the proper thread may still not be enough. It could be that your work is not consuming excessive CPU time, but is taking too long in “wall clock time” (e.g., the user sits waiting too long at a ProgressDialog). Or, it could be that your work, while efficient and in the background, is causing difficulty for foreground operations.
The following sections outline some common problems and solutions in this area.
Earlier in this book, we emphasized moving disk writes off to background threads.
Even better is to get rid of some of the disk writes entirely.
A big culprit here comes in the form of database operations. By default, each insert(), update(), or delete(), or any execSQL() invocation that modifies data, will occur in its own transaction. Each transaction involves a set of disk writes. Many times, this is not a problem. But, if you are doing a lot of these (such as importing records from a CSV file), hundreds or thousands of transactions will mean thousands of individual disk writes, and that can take some time. You may wish to wrap those operations in your own transaction, using methods like beginTransaction(), simply to reduce the number of transactions and, therefore, disk writes.
If you are doing your own disk I/O beyond databases, you may encounter similar sorts of issues. Overall, it is better to do a few larger writes than lots of little ones.
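The effect of batching small writes can be sketched in plain Java. In this illustration, BatchedWrites and CountingSink are hypothetical names, and the sink simply counts write requests as a stand-in for trips to the disk; a real file would behave differently in detail, but the ratio is the point.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch: BatchedWrites and CountingSink are illustrative names.
final class BatchedWrites {
    // Stand-in for a file: counts how many write requests reach it.
    static final class CountingSink extends OutputStream {
        int writeCalls = 0;

        @Override
        public void write(int b) {
            writeCalls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeCalls++;
        }
    }

    // Writes the given number of 16-byte records, one write() call per record.
    static void writeRecords(OutputStream out, int records) {
        byte[] record = new byte[16];
        try {
            for (int i = 0; i < records; i++) {
                out.write(record, 0, record.length);
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CountingSink direct = new CountingSink();
        writeRecords(direct, 1000);

        // Same records, but collected into 8 KB chunks before reaching the sink.
        CountingSink behindBuffer = new CountingSink();
        writeRecords(new BufferedOutputStream(behindBuffer, 8192), 1000);

        System.out.println("unbuffered writes: " + direct.writeCalls);
        System.out.println("buffered writes:   " + behindBuffer.writeCalls);
    }
}
```

The buffered variant hands the sink a handful of large chunks instead of a thousand small ones, which is the same idea as wrapping many database operations in one transaction.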
Threads you fork, by default, run at a default priority: THREAD_PRIORITY_DEFAULT as defined on the Process class. This is a lower priority than the main application thread (THREAD_PRIORITY_DISPLAY).
Threads you use via AsyncTask run at a lower priority (THREAD_PRIORITY_BACKGROUND). If you fork your own threads, then, you might wish to consider moving them to a lower priority as well, to affect how much time they get compared to the main application thread. You can do this via setThreadPriority() on the Process class.
The lowest possible priority, THREAD_PRIORITY_LOWEST, is described as “only for those who really, really don’t want to run if anything else is happening”. You might use this for “idle-time processing”, but bear in mind that the thread will be paused a lot to allow other threads to run.
Lower-priority threads will help ensure that your background work does not affect your foreground UI. Processes themselves are put in a lower-priority class as they move to the background (e.g., you have no activities visible), which further reduces the amount of CPU time you will be using at any given moment.
Also, note that IntentService uses a thread at default (not background) priority. You may wish to drop the priority of this thread to something that will be lower than your main application thread, to minimize how much CPU time the IntentService steals from your UI.
Just because you could do the work now does not mean you should do the work now. Perhaps a better answer is to do the work later, or do part of the work now and part of the work later.
For example, suppose that you have your own database of points of interest for your custom map application. Periodically, you publish a new database on your Web site, which your Android app should download. Odds are decent that the user is not in desperate need for this new database right away. In fact, the CPU time and disk I/O time to download and save the database might incrementally interfere with the foreground application, despite your best efforts.
In this case, not only should you check for and download the database when the user is unlikely to be using the device (e.g., before dawn), but you should check whether the screen is on via isScreenOn() on PowerManager, and delay the work to sometime when the screen is off. For example, you could have AlarmManager set up to have your code check for updates every 24 hours at 4am. If, at 4am, the screen is on, your code could skip the download and wait until tomorrow, or skip the download and add a one-shot alarm to wake you up in 30 minutes, in hopes that the user will no longer be using the device.
At the same time, you may wish to consider having a “refresh” menu choice somewhere, for when the user specifically wants you to go get the update (if available) now, for whatever reason.