Working Directly with the Camera

Letting third-party apps take the pictures and videos for you is all well and good, but there will be times when you need more control than that. It is possible for you to work directly with the device cameras. However, doing so is exceptionally complicated.

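Part of that complexity is because Android presently has three separate APIs for working with the camera: the original API based on android.hardware.Camera, the newer API in the android.hardware.camera2 package, and MediaRecorder for recording video.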

This chapter will attempt to outline the basic steps for using these APIs.


This chapter assumes that you have read the previous chapter covering Intent-based uses of the camera and the chapter on audio recording.

Notes About the Code Snippets

The code snippets shown in this chapter are here purely to illustrate how to call certain APIs. They are not from any particular sample project, as a sample project small enough to fit in a book would be riddled with bugs and limitations.

A Tale of Three APIs

As noted in the introduction to this chapter, there are three APIs for working with the camera. One — MediaRecorder — is focused purely on recording videos. It relies on you using one of the other two APIs for setting up the camera preview, so the user can see what will be recorded. Those other two APIs exist for taking still photos, where one (android.hardware.camera2) is substantially newer.


The original camera API is based around the android.hardware.Camera class.

(NOTE: there is another Camera class, in android.graphics, that is not directly related to taking pictures)

Instances of this class represent an open camera, where you call methods on the Camera to do things like take pictures. You also work extensively with a Camera.Parameters object, where you can determine a number of key characteristics about the camera (e.g., what the available resolutions are for pictures) and set up the particular results that you want.

This API works on all Android devices.


The original camera API worked, albeit with some difficulty. However, it was fairly limited, as it was designed primarily around the smartphone camera capabilities of 2005-2010. Nowadays, device manufacturers have access to much more powerful camera modules from chipset manufacturers like Qualcomm. Android needed a more powerful API to accommodate the current hardware, and a more flexible API to be able to adjust to changes over time.

Hence, Android 5.0 brought a new API, based on a series of classes in the android.hardware.camera2 package. On the plus side, these offer much greater capability. They are also designed with asynchronous work in mind, off-loading slow or complex operations onto background threads for you. However, on the whole, the API is more complicated, much less documented, and substantially different than the original API.

It is also only available on devices running Android 5.0 or higher. If your minSdkVersion is 21 or higher, that is not a problem. If, however, you are aiming to support older devices than that, you have two choices:

  1. Stick with the original API for all devices
  2. Use the original API for older devices and the newer API for newer devices

The latter might allow you to offer more features to users of those newer devices, but it does roughly double the work required to implement camera logic in your app.


MediaRecorder is responsible for both audio recording and video recording. MediaRecorder has a fairly limited API, one that has not changed substantially since 2011. However, if you use it carefully, it works. It works in tandem with either camera API — you use the camera APIs to show the user what will be recorded, and you use MediaRecorder to actually do the recording.

However, MediaRecorder has a number of issues, such as a fair bit of delay between when you ask it to begin recording and when it actually does begin recording. This makes it a poor choice for fast-twitch video recording purposes. Some apps, notably Vine, have elected to skip using MediaRecorder. Instead, they use the regular camera APIs. These APIs, among other things, give you access to the preview frames that are used to show the user what is visible through the camera lens. With a fair amount of work, you can stitch those together into a video. Needless to say, this is a beyond-advanced topic that is well outside the scope of this book.

The APIs That You (Probably) Can’t Use

The aforementioned APIs are all part of the Android SDK. For camera apps that ship with devices, those apps are not limited to these APIs. Device manufacturers are welcome to create apps that use internal proprietary APIs for their devices.

Hence, when it comes to determining what is and is not possible through the camera APIs, it is important to compare to other third-party camera apps, more so than manufacturer-supplied apps. Manufacturers can “cheat”; you cannot.

Performing Basic Camera Operations

Cameras have some key functionality:

In the following sections, we will outline what is required to perform these operations using the various APIs.


First, you need permission to use the camera. That way, when end users install your application, they will be notified that you intend to use the camera, so they can determine if they deem that appropriate for your application.

You simply need the CAMERA permission in your AndroidManifest.xml file, along with whatever other permissions your application logic might require.

If you plan to record video, using MediaRecorder, you will also want to request the RECORD_AUDIO permission.

And, if you were planning on storing pictures or videos out on external storage, you probably need the WRITE_EXTERNAL_STORAGE permission. The exception would be if your minSdkVersion is 19 or higher and you are only storing those files in locations that are automatically read/write for your app, such as getExternalFilesDir() or getExternalCacheDir().

Note that all three of these permissions (CAMERA, RECORD_AUDIO, and WRITE_EXTERNAL_STORAGE) are part of the Android 6.0 runtime permission system. If your app has a targetSdkVersion of 23 or higher, you will need to request those permissions at runtime. If your app has a lower targetSdkVersion, while you will not have to do anything special for your app, bear in mind that the user can still revoke your access to those capabilities, and so you may find lots of devices that claim to support a camera but just do not seem to have any cameras available when you try to use one.


Your manifest also should contain one or more <uses-feature> elements, declaring what you need in terms of camera hardware. By default, asking for the CAMERA permission indicates that you need a camera. More specifically, asking for the CAMERA permission indicates that you need an auto-focus camera.

The following sections outline some common scenarios and how to handle them.

A Camera is Optional

If you would like a camera, but having one is not essential for the use of your app, put the following <uses-feature> element in your manifest:

<uses-feature android:name="android.hardware.camera" android:required="false" />

This indicates that you would like a camera, but it is not required. This reverses the default established by the CAMERA permission.

A Camera is Required

Technically, you would not need any <uses-feature> element in your manifest to indicate that you need a camera, as the CAMERA permission would handle that for you. However, it is good form to explicitly declare it anyway:

<uses-feature android:name="android.hardware.camera" android:required="true" />

Not only does that make your manifest more self-documenting, but it also helps protect you in case the default behavior of the CAMERA permission changes.

Other Camera Features

There are three other camera features that you could consider having <uses-feature> elements for:

  1. android.hardware.camera.autofocus, to indicate whether or not the device needs a camera with auto-focus capability
  2. android.hardware.camera.flash, to indicate whether or not the device must support a camera flash
  3. android.hardware.camera.front, to indicate whether or not the app needs a front-facing camera specifically (android.hardware.camera requests a rear-facing camera)

Of these, the only one you should definitely include in your app is android.hardware.camera.autofocus, once again because of the default effects of requesting the CAMERA permission. In particular, if you do not absolutely need auto-focus capabilities, you can use android:required="false" to reverse the CAMERA default requirement.

Finding Out What Cameras Exist

Some devices will have just a rear-facing camera. Some will have just a front-facing camera. Some will have both cameras. Some will have no cameras. And, in theory at least, some could have yet more camera options.

At some point, you are likely to need to find out what cameras exist on the device that you are running on. Perhaps you need a particular camera (e.g., a front-facing camera for your “selfie”-focused app). Or, perhaps you want to allow your users to switch between cameras on the fly.


The simplest way to choose a camera is to not choose at all, and arrange to open the default camera. That default camera is the first rear-facing camera on the device. However, devices that have no rear-facing cameras effectively have no default camera, and so going with the default is rarely the correct choice. Instead, you should iterate over the available cameras, to find the one that you want.

To find out how many cameras there are for the current device, you can call the static getNumberOfCameras() method on the Camera class.

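To find out details about a particular camera, you can call the static getCameraInfo() method on Camera. This takes two parameters: the ID of the camera (a value from 0 to getNumberOfCameras()-1) and a Camera.CameraInfo object, which getCameraInfo() fills in with details about that camera.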

The most notable field on Camera.CameraInfo is facing, which tells you if this is a rear-facing (Camera.CameraInfo.CAMERA_FACING_BACK) or front-facing (Camera.CameraInfo.CAMERA_FACING_FRONT) camera.

For example, the following code snippet could be used to identify the first front-facing camera:

int chosen=-1;
int count=Camera.getNumberOfCameras();
Camera.CameraInfo info=new Camera.CameraInfo();

for (int cameraId=0; cameraId < count; cameraId++) {
  Camera.getCameraInfo(cameraId, info);

  if (info.facing==Camera.CameraInfo.CAMERA_FACING_FRONT) {
    // remember the first front-facing camera and stop looking
    chosen=cameraId;
    break;
  }
}

If chosen remains at a value of -1, you know that there is no front-facing camera available to you, and you would need to decide how you wish to proceed, if you really wanted such a camera.


With the original camera API, your main entry point is the Camera class. With the Android 5.0+ camera API, your main entry point is a CameraManager. This is another system service, one you can retrieve by calling getSystemService() on a Context, asking for the CAMERA_SERVICE:

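CameraManager mgr=
  (CameraManager)ctxt.getApplicationContext().getSystemService(Context.CAMERA_SERVICE);

(where ctxt is some Context, such as your activity)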

You will notice here that we are specifically calling getSystemService() on the Application context. That is because there is a bug in Android 5.0 where CameraManager leaks the Context that creates it. This bug has been fixed in Android 5.1. However, to be safe, you are better off retrieving this system service via the singleton Application object, as there is no risk of a memory leak (singletons are “pre-leaked”, as it were).

Given a CameraManager, you can call getCameraIdList() to get a list of camera IDs. These are strings, not integers as they were with the original camera API.

To learn more about the camera, you can ask the CameraManager to give you a CameraCharacteristics object for a given camera ID. The CameraCharacteristics object has all sorts of information about the camera, including what direction it is facing. CameraCharacteristics behaves a lot like a HashMap, in that you use get() and a key to retrieve a value, such as CameraCharacteristics.LENS_FACING to determine the camera’s facing direction.

So, the code snippet for finding the first front-facing camera, using a CameraManager named mgr, would be something like:

String chosen=null;

for (String cameraId : mgr.getCameraIdList()) {
  CameraCharacteristics cc=mgr.getCameraCharacteristics(cameraId);

  if (cc.get(CameraCharacteristics.LENS_FACING)==CameraCharacteristics.LENS_FACING_FRONT) {
    // remember the first front-facing camera and stop looking
    chosen=cameraId;
    break;
  }
}

Here, a value of null would indicate that there is no available front-facing camera.

Opening and Closing a Camera

Once you decide which camera you wish to use, you will eventually need to “open” it. This gives your app access to that camera, and blocks other apps’ access while you have it open. You need to open a camera before you can use that camera to take pictures, record video, etc.

Eventually, when you are done with the camera, you should close it, to allow other apps to have access to the camera again. If you fail to close it, the camera remains inaccessible to other apps until your process is terminated.


Old code samples would open the camera by calling a zero-parameter static open() method on the Camera class. This opens the default camera, and as noted above, this is rarely a good idea. However, it is your only option on API Level 8 and below, if you are still supporting such devices, as those devices only supported a single camera.

Instead, if you have the ID of the camera that you wish to open, call the one-parameter static open() method, passing in the ID of the camera.

Both flavors of open() return an instance of Camera, which you can hold onto in your activity or fragment that is working with the camera.

While you have access to this camera, no other process can. Hence, it is important to release the camera when you no longer need it. To release the camera, call release() on your Camera instance, after which it is no longer safe to use the camera. A common pattern is to open() the camera in onStart() or onResume() and release() it in onPause() or onStop(), so you tie up the camera only while you are in the foreground.


Opening and closing a camera is a lot more complicated with the Android 5.0+ camera API.

Partly, that complexity seems to be due to a threading limitation with CameraManager — while we want to do long tasks related to the camera on background threads, CameraManager itself is not free-threaded when it comes to opening and closing cameras. Hence, we need to use some form of thread synchronization to make sure that we are not trying to open and close cameras simultaneously.

Partly, that complexity comes from the way that CameraManager deals with background operations: via a Handler tied to a HandlerThread. HandlerThread, as the name suggests, is a Thread which has all the associated bits to support a Handler. The main application thread itself is a HandlerThread (or, close enough), but we specifically want to use a background thread, so we do not tie up the main application thread. So, we need to create and manage our own HandlerThread and Handler.

So, the first thing you will need to do is set up a HandlerThread, such as in a data member of some class:

final private HandlerThread handlerThread=new HandlerThread(NAME,
    android.os.Process.THREAD_PRIORITY_BACKGROUND);

Here, NAME is some string to identify this thread (used in places like the list of running threads in DDMS). The second parameter is the thread priority; in general, you want your own HandlerThread instances to have background priority.

Creating the HandlerThread instance does not actually start the thread, any more than creating a Thread object starts the thread. Instead, you need to call start() when you want the thread to begin working its message loop. Any time after this point, it is safe to create a Handler for that HandlerThread, by getting the Looper from the HandlerThread and passing it to the Handler constructor:

handler=new Handler(handlerThread.getLooper());

(You might wonder why a class named HandlerThread, designed to work with a Handler, lacks any methods to give you such a Handler. Lots of people wonder this, so you are not alone.)

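Next, to actually open the camera, you will need to call openCamera() on your CameraManager, supplying the ID of the camera to open, a CameraDevice.StateCallback to be notified about the state of that camera, and the Handler to use for invoking that callback.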

But, we want to make sure that we are not trying to open or close another camera while all of this is going on, so we need to use some sort of Java thread synchronization for that, such as a Semaphore:

final private Semaphore lock=new Semaphore(1);

Then, we can consider opening the camera, once we obtain the lock:

if (!lock.tryAcquire(2500, TimeUnit.MILLISECONDS)) {
  throw new RuntimeException("Timed out waiting to lock camera opening.");
}

mgr.openCamera(cameraId, new DeviceCallback(), handler);

You will notice that we do not release the lock here, as we need to keep the lock until the camera has completed opening.

CameraDevice.StateCallback is an abstract class, so we usually have to create some dedicated subclass for it. There are three abstract methods that we will need to implement: onOpened(), onError(), and onDisconnected(). Plus, we will typically want to implement onClosed(), even though there is a default implementation of this callback.

onOpened() will be called when the camera is open and is ours to use. We are passed a CameraDevice object representing our open camera, and it is our job to hold onto this device while we have the camera open. The big thing that we need to do in onOpened() is release that lock that we obtained when we tried opening the camera. This is also a fine time to consider starting to show camera previews to the user, and we will see how to do that in upcoming sections of the book.

onError() will be called if there is some serious error when trying to open or use the camera. We are passed an error code to indicate what sort of problem we encountered. It could be that the camera is already in use (ERROR_CAMERA_IN_USE), or that while the camera exists, we do not have access to it due to device policy (ERROR_CAMERA_DISABLED), or that there was a general problem with this specific camera (ERROR_CAMERA_DEVICE) or with the overall camera engine (ERROR_CAMERA_SERVICE).

onDisconnected() will be called if we no longer can use the camera, for reasons other than our closing it ourselves. We are supposed to close the CameraDevice, if we have one, as the camera is no longer usable.

To close the camera, whether in response to onDisconnected() or because you are simply done with the camera, call close() on the CameraDevice, inside of the lock:

try {
finally {

Note that close() is a synchronous call, and so we can release() our lock in a finally block.

Our CameraDevice.StateCallback will be called with onClosed(), to let us know that the close operation has completed.
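
The locking pattern described in this section can be sketched in plain Java, independent of the camera classes. CameraOpenCloseGate and its method names are hypothetical, not part of the Android SDK. The idea is that you would call beginOpen() before openCamera(), call openFinished() from onOpened(), onError(), or onDisconnected(), and perform the close() on the CameraDevice inside the gate's close() method. The AtomicBoolean is an assumption of this sketch, added so that two callbacks for the same open attempt cannot release the lock twice.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper: ensures that opening and closing never overlap. */
final class CameraOpenCloseGate {
  private final Semaphore lock = new Semaphore(1);
  private final AtomicBoolean opening = new AtomicBoolean(false);

  /** Call before the asynchronous open; returns false if the lock timed out. */
  boolean beginOpen(long timeoutMs) {
    try {
      if (lock.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
        opening.set(true);
        return true;
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for callers
    }
    return false;
  }

  /** Call from the open callbacks; only the first call releases the lock. */
  void openFinished() {
    if (opening.compareAndSet(true, false)) {
      lock.release();
    }
  }

  /** Runs a synchronous close inside the lock, releasing it in finally. */
  void close(Runnable closer) {
    lock.acquireUninterruptibly();
    try {
      closer.run();
    } finally {
      lock.release();
    }
  }

  /** True while an open or a close currently holds the lock. */
  boolean isBusy() {
    return lock.availablePermits() == 0;
  }
}
```

With something like this in place, the Runnable passed to close() would be where you call close() on your CameraDevice.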

Setting Up a Preview Surface

The camera preview is basically a stream of images, taken by the camera, usually at less than full resolution. Mostly, that stream is to be presented to the user on the screen, to help them “see what the camera sees”, so they can line up the right picture.

For presenting the preview stream to the user, there are two typical solutions: SurfaceView and TextureView.

SurfaceView for the Camera

SurfaceView is used as a raw canvas for displaying all sorts of graphics outside of the realm of your ordinary widgets. In this case, Android knows how to display a live look at what the camera sees on a SurfaceView, to serve as a preview pane. A SurfaceView is also used for video playback, and a variation of SurfaceView called GLSurfaceView is used for OpenGL animations.

That being said, SurfaceView is a subclass of View, and so it can be added to your UI the same as any other widget.

If your app will support API Level 10 and older, you will want to call getHolder().setType(SurfaceHolder.SURFACE_TYPE_PUSH_BUFFERS) on the SurfaceView. A “push buffers” SurfaceView is one designed to have images pushed to the surface, usually from video playback or camera previews. A SurfaceHolder is a quasi-controller object for the SurfaceView; most interactions with the SurfaceView come by way of the SurfaceHolder. This bit of configuration is not needed on API Level 11 and higher, as Android handles it for us automatically as the SurfaceView is put to use.

TextureView for the Camera

SurfaceView, however, has some limitations. This is mostly tied back to the way it works, by “punching a hole” in the UI to allow some lower-level component (like the camera) to render stuff into it. While there is a transparent layer on top of this “hole”, for use in alpha-compositing in any overlapping widgets, the SurfaceView content is not rendered as part of the normal view hierarchy. The net effect is that you cannot readily move, animate, or otherwise transform a SurfaceView.

TextureView was added in API Level 14 and works for camera previews as of API Level 15. TextureView serves much the same role as does SurfaceView, for showing camera previews, playing videos, or rendering OpenGL scenes. However, TextureView behaves as a regular View and so therefore can be animated and such without issue.

However, the cost is in performance. TextureView relies upon the GPU to do more work, and therefore TextureView is a bit less performant than is a SurfaceView. Most camera apps will not show a difference.

Showing the Previews

To show previews, you need to create your surface (SurfaceView or TextureView) and have it be part of your UI. Then, you can teach your opened camera to show previews on that surface.


The biggest thing that we need to do in the original camera API to configure the preview is determine what size of preview images should be used. Devices cannot support arbitrary-sized previews. Instead, we need to ask the camera what preview sizes it supports, choose one, then configure the camera to use that specific preview size.

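To do any of this, we need the Camera.Parameters associated with our chosen and open Camera. Camera.Parameters serves two roles: it tells you what the camera is capable of (e.g., the supported preview sizes), and it is where you set what you want the camera to do (e.g., the preview size to use).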

Getting the Camera.Parameters object from a Camera is a simple matter of calling getParameters().

To find out what the valid preview sizes are, we can call getSupportedPreviewSizes() on the Camera.Parameters object. This will return a List of Camera.Size objects, with each Camera.Size holding a width and a height as integers.

Choosing a preview size is a bit of an art form. Too big of a preview size is wasteful from a performance standpoint. Too small of a preview size results in a grainy preview. And, as will be seen later in this chapter, the difference in aspect ratio between your surface and your preview size will need to be taken into account. We will explore choosing preview sizes a bit more later in this chapter. For the moment, assume that we have sifted through the available preview sizes and have chosen something suitable. Whatever size you choose, you can pass to setPreviewSize() on the Camera.Parameters.

Then, you can call setParameters() on the Camera, passing in your modified Camera.Parameters object, to effect this change.

You will wind up with a block of code resembling:

Camera.Parameters parameters=camera.getParameters();
Camera.Size previewSize=chooseSomePreviewSize(parameters.getSupportedPreviewSizes());

parameters.setPreviewSize(previewSize.width, previewSize.height);
camera.setParameters(parameters);

(where chooseSomePreviewSize() is a method of your own design)
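
One possible shape for a method like chooseSomePreviewSize() is sketched below. It is only an illustration, under a few assumptions: PreviewSize is a hypothetical stand-in for Camera.Size (so the sketch runs without the Android SDK), the target width and height describe your preview surface in the same orientation as the camera's sizes, and the 0.01 aspect ratio tolerance is an arbitrary value chosen for the example. The approach is to favor the closest aspect ratio first, then the area closest to the target.

```java
import java.util.List;

/** Hypothetical stand-in for Camera.Size; not the Android class. */
final class PreviewSize {
  final int width;
  final int height;

  PreviewSize(int width, int height) {
    this.width = width;
    this.height = height;
  }
}

final class PreviewSizeChooser {
  // Arbitrary tolerance (an assumption) for treating two ratios as "the same"
  private static final double RATIO_TOLERANCE = 0.01;

  /**
   * Picks the size whose aspect ratio is closest to the target's; among
   * sizes with about the same ratio, picks the one closest in area.
   * Returns null if there are no candidates.
   */
  static PreviewSize choose(List<PreviewSize> candidates,
                            int targetWidth, int targetHeight) {
    double targetRatio = (double) targetWidth / targetHeight;
    long targetArea = (long) targetWidth * targetHeight;
    PreviewSize best = null;
    double bestRatioDiff = Double.MAX_VALUE;
    long bestAreaDiff = Long.MAX_VALUE;

    for (PreviewSize size : candidates) {
      double ratioDiff =
          Math.abs((double) size.width / size.height - targetRatio);
      long areaDiff =
          Math.abs((long) size.width * size.height - targetArea);
      boolean betterRatio = ratioDiff < bestRatioDiff - RATIO_TOLERANCE;
      boolean sameRatio =
          Math.abs(ratioDiff - bestRatioDiff) <= RATIO_TOLERANCE;

      if (betterRatio || (sameRatio && areaDiff < bestAreaDiff)) {
        best = size;
        bestRatioDiff = ratioDiff;
        bestAreaDiff = areaDiff;
      }
    }

    return best;
  }
}
```

A real implementation would likely also cap the preview size for performance reasons, as discussed above.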

Given that, in principle, there are just three more steps:

  1. Attach your preview surface to the Camera by calling setPreviewDisplay() (if you are using a SurfaceView) or setPreviewTexture() (if you are using a TextureView)
  2. Show the preview on-screen by calling startPreview() on the Camera
  3. Stop showing the preview by calling stopPreview() on the Camera

However, timing is important.

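You cannot call setPreviewDisplay() or startPreview() before your preview surface is ready. To know when that is, you will need to register a listener with your surface: a SurfaceHolder.Callback, added via addCallback() on the SurfaceHolder, for a SurfaceView, or a TextureView.SurfaceTextureListener, set via setSurfaceTextureListener(), for a TextureView.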

You also need to stop the preview before you release() the Camera. And, as we will see later in this chapter, you also need to restart your preview after taking a photo.


Once the camera is opened — even right from within the onOpened() method of your CameraDevice.StateCallback — you can request to have preview frames be pushed to your desired preview surface.

First, strangely enough, you are going to need to choose the resolution of the picture that you wish to take. You might think that this would be delayed until a later point, such as when we actually go to take a picture, but the API seems to want it right away.

To find out the possible resolutions, you need to request a StreamConfigurationMap from the CameraCharacteristics:

CameraCharacteristics cc=mgr.getCameraCharacteristics(cameraId);
StreamConfigurationMap map=
  cc.get(CameraCharacteristics.SCALER_STREAM_CONFIGURATION_MAP);

(where cameraId is the ID of the camera that you are working with)

From there, you can get an array of Size objects via a call to getOutputSizes(). Curiously, one flavor of getOutputSizes() takes a Java class object, identifying the use case for the frames to be generated by the camera, while another takes an int identifying an image format. So, passing SurfaceTexture.class would give you preview frame resolutions, but passing ImageFormat.JPEG would give you picture resolutions (at least, for images to be encoded in JPEG format).

So, you can get your roster of available picture sizes via:

CameraCharacteristics cc=mgr.getCameraCharacteristics(cameraId);
StreamConfigurationMap map=
  cc.get(CameraCharacteristics.SCALER_STREAM_CONFIGURATION_MAP);
Size[] rawSizes=map.getOutputSizes(ImageFormat.JPEG);

From there, you will need to choose a size. This process can be a bit interesting; some notes about it appear later in this chapter. But, for example, you might choose the size that is the highest resolution, as determined by the total area (width times height).
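
For example, picking the highest-resolution size by area might look like the following sketch. PictureSize is a hypothetical stand-in for android.util.Size, modeling only getWidth() and getHeight(), so that the example runs without the Android SDK. The area is computed as a long so that large dimensions cannot overflow an int.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;

/** Hypothetical stand-in for android.util.Size; only the getters are modeled. */
final class PictureSize {
  private final int width;
  private final int height;

  PictureSize(int width, int height) {
    this.width = width;
    this.height = height;
  }

  int getWidth() {
    return width;
  }

  int getHeight() {
    return height;
  }
}

final class LargestAreaChooser {
  /** Orders sizes by total area (width times height). */
  static final Comparator<PictureSize> BY_AREA =
      Comparator.comparingLong(
          (PictureSize s) -> (long) s.getWidth() * s.getHeight());

  /** Returns the largest size; throws NoSuchElementException if sizes is empty. */
  static PictureSize largest(PictureSize[] sizes) {
    return Collections.max(Arrays.asList(sizes), BY_AREA);
  }
}
```

With the real API, the array passed in would be the Size[] that you got from getOutputSizes().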

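Next, you are going to need to set up an ImageReader. Typically this is done via the newInstance() factory method, which takes four parameters: the width and height of the images to be produced, the image format (e.g., ImageFormat.JPEG), and the maximum number of images that you can have acquired from the reader at any one time: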

ImageReader reader=ImageReader.newInstance(pictureSize.getWidth(),
          pictureSize.getHeight(), pictureFormat, 2);

Then, you need a Surface associated with your preview surface. For example, you can call getSurfaceTexture() on a TextureView to get a SurfaceTexture, then pass it to the Surface constructor to get the associated Surface object.

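Next, you can call createCaptureSession() on the CameraDevice representing the opened camera. This takes three parameters: a List of the Surface objects that the camera may write frames to, a CameraCaptureSession.StateCallback to be notified about the state of the session, and the Handler to use for invoking that callback:

cameraDevice
  .createCaptureSession(Arrays.asList(surface, reader.getSurface()),
                        new PreviewCaptureSession(), handler);

(where cameraDevice is your opened CameraDevice and PreviewCaptureSession is some subclass of CameraCaptureSession.StateCallback)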

That actually does not begin the previews. Instead, it configures the camera to indicate that it is possible to do previews.

To continue the work for getting the previews rolling, in the onConfigured() callback method on your CameraCaptureSession.StateCallback, you can create a CaptureRequest.Builder that you can use for configuring the camera to capture preview frames. You get one of those by calling createCaptureRequest() on the CameraDevice, passing in an int indicating the general type of request that you are creating, such as TEMPLATE_PREVIEW for preview frames:

CaptureRequest.Builder b=
  cameraDevice.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW);

You then call addTarget() on the Builder, supplying the Surface onto which the captured frames will be written. For previews, that target is the Surface associated with your preview surface.

You can also call set() on the Builder to configure various options that you would like for the camera, such as auto-focus modes, flash modes, and the like. The code snippet shown below demonstrates setting up “continuous picture” auto-focus mode and having the auto-exposure mode engage the flash as needed.

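Eventually, you ask the CaptureRequest.Builder to build() you a CaptureRequest, and you pass that to setRepeatingRequest() on the CameraCaptureSession that is passed into onConfigured() of your CameraCaptureSession.StateCallback:

public void onConfigured(CameraCaptureSession session) {
  try {
    // cameraDevice is your opened CameraDevice; surface is your preview Surface
    CaptureRequest.Builder b=
      cameraDevice.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW);

    b.addTarget(surface);
    b.set(CaptureRequest.CONTROL_AF_MODE,
      CaptureRequest.CONTROL_AF_MODE_CONTINUOUS_PICTURE);
    b.set(CaptureRequest.CONTROL_AE_MODE,
      CaptureRequest.CONTROL_AE_MODE_ON_AUTO_FLASH);
    // other Builder configuration goes here

    CaptureRequest previewRequest=b.build();
    session.setRepeatingRequest(previewRequest, null, handler);
  }
  catch (CameraAccessException e) {
    // do something
  }
  catch (IllegalStateException e) {
    // do something
  }
}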

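setRepeatingRequest() takes three parameters: the CaptureRequest to repeat, an optional CameraCaptureSession.CaptureCallback to be notified about each capture (null here, as we do not need those notifications for previews), and the Handler to use for invoking that callback.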

Note that you will want to hold onto the CaptureRequest.Builder that you created here, as you will want it again when it comes time to take a picture.

When you go to close() the CameraDevice, before you do so, you must also close up the previews. You do this by calling close() on the CameraCaptureSession and close() on your ImageReader.

Taking a Picture

At some point, you will want to take a picture. Typically, this is based on user input, though it would not have to be. Taking a picture involves not only telling the camera to capture a picture (typically at a different resolution than the previews), but also arranging to get that written out to disk somewhere as a JPEG file.


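Taking a photo with a Camera is a matter of calling takePicture() on the Camera object. There are two flavors of takePicture(), for which three parameters are in common: a Camera.ShutterCallback, to be called at the moment the image is captured; a Camera.PictureCallback, to be called when the raw image data is available; and a Camera.PictureCallback, to be called when the JPEG-compressed image is available. Any of these can be null if you do not need that notification.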

The four-parameter version of takePicture() also takes a third Camera.PictureCallback, to be called when “a scaled, fully processed postview image is available”. This explanation probably means something to somebody, but the author of this book has no idea what it means.

You cannot call takePicture() until after startPreview() has been called to set up a preview pane. takePicture() will automatically stop the preview. At some point, if you want to be able to take another photo, you will need to call startPreview() again. Note, though, that you cannot call startPreview() until after the final compressed photo has been delivered to your Camera.PictureCallback object.

Before you call takePicture(), you are going to want to adjust the Camera.Parameters to configure how the photo should be taken. The primary setting to adjust is the size of the picture to take. Just as you ask Camera.Parameters for available preview sizes and choose one, you can call getSupportedPictureSizes(), which returns a List of Camera.Size objects. You can then choose a size and pass its width and height to setPictureSize() on the Camera.Parameters. Other things to potentially adjust include:

Note that calling setParameters() multiple times seems to lead to camera instability. Ideally, you collect all your desired settings from the user up front, then call setParameters() once when you set up your preview size. If you need to change parameters, you may wish to consider closing and re-opening the camera.

The Camera.PictureCallback will be called with onPictureTaken() and will be handed a byte array representing the picture. Typically, you will supply a PictureCallback for JPEG images, and so the byte array will represent the photo encoded in JPEG. At this point, you can hand that byte array off to a background thread to write it to disk, upload it to some server, or whatever else you planned to do with the picture.

Note that one thing you cannot readily do with the picture is hand it to another activity. There is a 1MB limit on the size of an Intent used with startActivity(), and usually the JPEG will be bigger than that. Hence, you cannot readily pass the picture via an Intent extra to another activity. If at all possible, use fragments or something else to keep all your relevant bits of UI together in a single activity, rather than try to get the images from activity to activity.


First, you should attach an ImageReader.OnImageAvailableListener instance to your ImageReader, using setOnImageAvailableListener(). ImageReader.OnImageAvailableListener is an interface; you will be called with onImageAvailable() when a new image is delivered to the ImageReader. We will come back to that onImageAvailable() method after quite a bit of additional coding.

Next, given the CaptureRequest.Builder you created when you set up the previews, you need to adjust the builder to lock the auto-focus (assuming that auto-focus is enabled):


At that point, you can build() a fresh CaptureRequest and call setRepeatingRequest() on the CameraCaptureSession, to change the previews to switch to a locked focus:

              new RequestCaptureTransaction(),

Here, RequestCaptureTransaction is a subclass of CameraCaptureSession.CaptureCallback, so you can be notified of how the auto-focus locking is proceeding. You wind up having to implement a fairly convoluted state machine to eventually find out it is time to take a picture… or possibly to ask for a “precapture trigger” to start on the auto-exposure system:

private class RequestCaptureTransaction extends CameraCaptureSession.CaptureCallback {
  private final Session s;
  boolean isWaitingForFocus=true;
  boolean isWaitingForPrecapture=false;
  boolean haveWeStartedCapture=false;

  RequestCaptureTransaction(CameraSession session) {

  public void onCaptureProgressed(CameraCaptureSession session,
                                  CaptureRequest request, CaptureResult partialResult) {

  public void onCaptureFailed(CameraCaptureSession session, CaptureRequest request, CaptureFailure failure) {
    // TODO: raise event

  public void onCaptureCompleted(CameraCaptureSession session, CaptureRequest request, TotalCaptureResult result) {

  private void capture(CaptureResult result) {
    if (isWaitingForFocus) {

      int autoFocusState=result.get(CaptureResult.CONTROL_AF_STATE);

      if (CaptureResult.CONTROL_AF_STATE_FOCUSED_LOCKED == autoFocusState ||
          CaptureResult.CONTROL_AF_STATE_NOT_FOCUSED_LOCKED == autoFocusState) {
        Integer state=result.get(CaptureResult.CONTROL_AE_STATE);

        if (state == null ||
            state == CaptureResult.CONTROL_AE_STATE_CONVERGED) {
        else {
    else if (isWaitingForPrecapture) {
      Integer state=result.get(CaptureResult.CONTROL_AE_STATE);

      if (state == null ||
          state == CaptureResult.CONTROL_AE_STATE_PRECAPTURE ||
          state == CaptureRequest.CONTROL_AE_STATE_FLASH_REQUIRED) {
    else if (!haveWeStartedCapture) {
      Integer state=result.get(CaptureResult.CONTROL_AE_STATE);

      if (state == null ||
          state != CaptureResult.CONTROL_AE_STATE_PRECAPTURE) {

  private void precapture() {
    try {
      s.captureSession.capture(, this, handler);
    catch (Exception e) {
      // do something

  private void capture() {
    try {
      CaptureRequest.Builder captureBuilder=


          new CapturePictureTransaction(), null);
    catch (Exception e) {
      // do something

The author of this book wishes he understood what all this stuff is for.

But, eventually, it will be time to take the picture, represented by the capture() method in the above code dump. Here, we create a new CaptureRequest.Builder, this time using TEMPLATE_STILL_CAPTURE to indicate that we are trying to take a picture. We set up our target (via addTarget()) to be the Surface from the ImageReader. We re-establish our desired auto-focus and auto-exposure modes. Then, we stop the previews, by calling stopRepeating() on the CameraCaptureSession, undoing the prior setRepeatingRequest() call where we asked for previews. Then, we call capture() on the CameraCaptureSession, requesting a single-frame capture rather than a repeating request. This, like setRepeatingRequest(), takes our CaptureRequest from the Builder, a CameraCaptureSession.CaptureCallback to find out the results of the capture work, and our Handler.

The primary job of this CameraCaptureSession.CaptureCallback is to restart the previews, in onCaptureCompleted(). First, we use the preview edition of the CaptureRequest.Builder to undo some of the changes made during the camera capture process. Then, given the original preview CaptureRequest, we call setRepeatingRequest() again, to get the previews showing once more:

public void onCaptureCompleted(CameraCaptureSession session, CaptureRequest request, TotalCaptureResult result) {
  try {
    s.captureSession.capture(, null, handler);
    s.captureSession.setRepeatingRequest(previewRequest, null, handler);
  catch (CameraAccessException e) {
    // do something
  catch (IllegalStateException e) {
    // do something

As part of all of this work, your onImageAvailable() method on your ImageReader.OnImageAvailableListener will be called when the picture is ready. The recipe for getting your JPEG image looks like this:

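public void onImageAvailable(ImageReader imageReader) {
  Image image=imageReader.acquireNextImage();
  ByteBuffer buffer=image.getPlanes()[0].getBuffer();
  byte[] bytes=new byte[buffer.remaining()];

  buffer.get(bytes); // copy the JPEG data out of the Image
  image.close(); // release the Image back to the ImageReader

  // do something with the byte[] of JPEG data
}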

Here, you are subject to the same sorts of limitations as were described in the section on taking pictures with the original camera API. Notably, that byte array may be large, too large to put into an Intent extra and pass to another activity.

Recording a Video

Traditional Android video recording is handled via MediaRecorder. This means that we need to hand control of the camera from the regular camera API that we are using over to MediaRecorder, record the video, and then return control to the camera API (e.g., for previews).

MediaRecorder itself then has its own API for configuring the recorder, starting the recording, and stopping the recording.


With the original camera API, to retain camera access for your app, but allow MediaRecorder to take over the camera, call stopPreview(), then unlock(), on the Camera object.


When the recording is complete, you reverse the process, by calling reconnect() and startPreview() on the Camera.


In between the unlock() and reconnect() calls is when you use the MediaRecorder API.


This particular combination (video recording with the Android 5.0+ camera API) will be covered in a future edition of this chapter.

Using MediaRecorder

Creating a MediaRecorder instance is simple enough: just use the zero-argument constructor.

You then need to tell it what camera to use. With the original camera API, that is a matter of calling setCamera() on the MediaRecorder, passing in your Camera object.

MediaRecorder recorder=new MediaRecorder();
recorder.setCamera(camera); // camera is your unlocked Camera object

Next, call setAudioSource() and setVideoSource() to indicate where the audio and video to be recorded are coming from. The typical value to use for the audio source is CAMCORDER. For the original camera API, you will need to use CAMERA as the video source. These values are defined on MediaRecorder.AudioSource and MediaRecorder.VideoSource, respectively.


Next, you need to configure how the video should be recorded, in terms of things like resolution. The typical approach using the original camera API is to use setProfile(), passing in a CamcorderProfile to the MediaRecorder. You can find out what profiles are supported by calling methods like hasProfile() on CamcorderProfile. There are some fairly generic profiles, like QUALITY_HIGH and QUALITY_LOW, and some fairly specific profiles, like QUALITY_2160P for 4K video. Not all devices will support all profiles, based on Android version and camera driver capabilities. So, you will need to be responsive to varying cameras and gracefully degrade from the profile you want to a profile that you can get. For example, the following code snippet tries QUALITY_HIGH, falls back to QUALITY_LOW if QUALITY_HIGH is not available, and bails out if neither of those profiles exists:

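boolean canGoHigh=CamcorderProfile.hasProfile(cameraId,
    CamcorderProfile.QUALITY_HIGH);
boolean canGoLow=CamcorderProfile.hasProfile(cameraId,
    CamcorderProfile.QUALITY_LOW);

if (canGoHigh) {
  recorder.setProfile(CamcorderProfile.get(cameraId,
      CamcorderProfile.QUALITY_HIGH));
}
else if (canGoLow) {
  recorder.setProfile(CamcorderProfile.get(cameraId,
      CamcorderProfile.QUALITY_LOW));
}
else {
  throw new IllegalStateException(
      "cannot find valid CamcorderProfile");
}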

Here, cameraId is the int identifying your open camera.
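
If you want to degrade through more than two profiles, the same idea generalizes to an ordered list of preferences. The following is a minimal plain-Java sketch of that fallback chain. ProfileChooser and firstSupported() are hypothetical names, and the IntPredicate stands in for a call to CamcorderProfile.hasProfile() so that the logic can be read (and tested) without a device:

```java
import java.util.function.IntPredicate;

// Hypothetical helper: picks the first quality that the device supports.
final class ProfileChooser {
  // preferredQualities is ordered from most to least desirable.
  // isSupported answers "does the device offer this quality?"
  static int firstSupported(int[] preferredQualities,
                            IntPredicate isSupported) {
    for (int quality : preferredQualities) {
      if (isSupported.test(quality)) {
        return quality;
      }
    }

    throw new IllegalStateException(
        "cannot find valid CamcorderProfile");
  }
}
```

On Android, you might pass an array such as QUALITY_1080P, QUALITY_720P, QUALITY_HIGH, QUALITY_LOW and a predicate that calls CamcorderProfile.hasProfile(cameraId, quality), then hand the winning quality to CamcorderProfile.get().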

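Then, you can configure where the video should be written, along with optional limits on its size and duration and a hint about its orientation:

recorder.setOutputFile(new File(getExternalFilesDir(null), FILENAME).getAbsolutePath());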
recorder.setMaxFileSize(5000000); // ~5MB max
recorder.setMaxDuration(10000); // ~10 seconds max
recorder.setOrientationHint(90); // rotate output 90 degrees

Optionally, you can call setInfoListener() and setErrorListener(), supplying objects that will be invoked when certain events occur. Notably, if you use setMaxFileSize() or setMaxDuration(), the OnInfoListener object will be notified when recording automatically stops due to reaching one of those limits.

You then call prepare(), followed by start(), and your video recording will commence.


When it comes time to stop the recording manually (e.g., user taps a “stop” button), just call stop(), then release(), on the MediaRecorder.

Configuring the Still Camera

In general, when using the camera classes in Android, you get reasonable defaults for things like focus mode and flash mode. However, what might be reasonable defaults may not be what the user wants in any given circumstance. Other bits of configuration, like zoom, cannot really be defaulted (other than to “no zoom”).

For these, you will need to provide some sort of UI to allow the user to request settings, then apply them as part of your camera implementation. Here, we will focus on applying the configuration.

(and, yes, that was a pun)

Focus Mode

Frequently, a user will want simple autofocus behavior, where the camera attempts to focus on the content centered within the preview. However, in some situations, the user may want autofocus to be disabled, turning the camera into a fixed-focus camera. And there are some specialty focus modes that may be available to you as well, depending upon device and camera API.

Here is how you can set up the camera to use one of those focus modes, for each of the camera APIs.


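The Camera.Parameters object has a getSupportedFocusModes() method. This returns a List of String objects, where each value corresponds to a focus mode that is available on this camera (front-facing, rear-facing) on this device. The possible strings are defined as constants on Camera.Parameters: FOCUS_MODE_AUTO, FOCUS_MODE_CONTINUOUS_PICTURE, FOCUS_MODE_CONTINUOUS_VIDEO, FOCUS_MODE_EDOF, FOCUS_MODE_FIXED, FOCUS_MODE_INFINITY, and FOCUS_MODE_MACRO.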

In truth, few devices support all of these. However, every device will support at least one; getSupportedFocusModes() is guaranteed to not return null and not return an empty List.

To choose a focus mode, call setFocusMode() on the Camera.Parameters, supplying the string of the desired mode. And, of course, you will eventually need to call setParameters() on the Camera, supplying your modified Camera.Parameters.
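
Since few devices support every focus mode, you will usually want an ordered list of preferences and a fallback. Here is a minimal plain-Java sketch of that selection logic. FocusModeChooser is a hypothetical helper; the List of supported modes stands in for the result of getSupportedFocusModes(), and the sketch leans on the guarantee that this List is never empty:

```java
import java.util.List;

// Hypothetical helper: picks a focus mode from what the camera offers.
final class FocusModeChooser {
  // preferred is ordered from most to least desirable. If none of the
  // preferred modes is supported, we settle for the first supported one.
  static String choose(List<String> supported, List<String> preferred) {
    for (String mode : preferred) {
      if (supported.contains(mode)) {
        return mode;
      }
    }

    return supported.get(0);
  }
}
```

The String that comes back is what you would pass to setFocusMode().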


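Similarly, with the camera2 API, you can get a list of supported auto-focus modes by calling get(CameraCharacteristics.CONTROL_AF_AVAILABLE_MODES) on a CameraCharacteristics object tied to your chosen camera. This returns an array of int values, instead of a List of strings. The possible values are defined as constants on CameraMetadata: CONTROL_AF_MODE_OFF, CONTROL_AF_MODE_AUTO, CONTROL_AF_MODE_MACRO, CONTROL_AF_MODE_CONTINUOUS_VIDEO, CONTROL_AF_MODE_CONTINUOUS_PICTURE, and CONTROL_AF_MODE_EDOF.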

After the user chooses a value, you will need to call set(CaptureRequest.CONTROL_AF_MODE, ...) on your CaptureRequest.Builder, where ... is the int of the desired focus mode. Note that you will need to do this both for the CaptureRequest.Builder for preview frames and for the CaptureRequest.Builder used when you take an actual picture. If the user is changing this value while you are already showing the preview, you will need to update the preview behavior, by calling build() on the Builder to create the CaptureRequest, then calling setRepeatingRequest() to override your previous CaptureRequest with the new one with the new focus mode. As a result, you tend to want to hang onto your CaptureRequest.Builder for previews, so you can make these sorts of incremental changes in behavior, without having to create a fresh Builder from scratch with all of the desired settings.

Flash Mode

Typically, users want flash when they need flash, due to insufficient ambient lighting. However, once again, they may want specific flash modes instead (definitely flash, definitely not flash, etc.).

As with focus modes, you can ask the camera APIs what flash modes are available for a given camera. In this case, though, there is no guarantee of any flash mode configurability, since not all cameras have flash (and Android considers “off” and “flash does not exist” to be different things). And, once the user has chosen a flash mode, you can configure the camera APIs to use that particular mode.

Of course, the details vary by camera API.


Camera.Parameters has getSupportedFlashModes(), which returns a List of strings representing the supported flash modes, or null if flash modes cannot be configured for this camera. The string values map to constants defined on Camera.Parameters: FLASH_MODE_AUTO, FLASH_MODE_OFF, FLASH_MODE_ON, and FLASH_MODE_RED_EYE.

There is an additional flash mode, FLASH_MODE_TORCH, that will keep the flash on during the preview as well as during the actual act of taking the picture. In truth, this setting is more often used for flashlight apps.

Once the user has chosen a flash mode, you can call setFlashMode() on the Camera.Parameters, then eventually call setParameters() on the Camera.
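
The null return value is easy to overlook. Here is a minimal plain-Java sketch of a null-safe check that you might run before calling setFlashMode(). FlashModeResolver is a hypothetical helper, and its List parameter stands in for the result of getSupportedFlashModes():

```java
import java.util.List;

// Hypothetical helper: validates a requested flash mode.
final class FlashModeResolver {
  // Returns the requested mode if the camera supports it, or null if
  // flash cannot be configured at all or the mode is not on offer.
  static String resolve(List<String> supported, String requested) {
    if (supported == null || !supported.contains(requested)) {
      return null;
    }

    return requested;
  }
}
```

A null result is your cue to skip the setFlashMode() call and, ideally, to hide or disable the corresponding option in your UI.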


To find out what flash modes are available for a camera2 camera, you can call get(CameraCharacteristics.CONTROL_AE_AVAILABLE_MODES) on the CameraCharacteristics for the camera in question. This returns an array of int values, mapping to constants defined on CameraMetadata: CONTROL_AE_MODE_ON, CONTROL_AE_MODE_ON_AUTO_FLASH, CONTROL_AE_MODE_ON_ALWAYS_FLASH, and CONTROL_AE_MODE_ON_AUTO_FLASH_REDEYE.

Here, AE is short for “auto-exposure”. CONTROL_AE_MODE_ON says that auto-exposure is enabled, just without any flash. There is a separate CONTROL_AE_MODE_OFF which totally disables the auto-exposure capability. However, that will screw up auto-focus and auto-white balance, and so rarely will camera apps want to use CONTROL_AE_MODE_OFF.

Once the user chooses the desired flash mode, you can call set(CaptureRequest.CONTROL_AE_MODE, ...) on your CaptureRequest.Builder object, where ... is the desired flash mode int. You will need to do this both for the preview Builder and the Builder used when actually taking the picture. If the user is changing this value on the fly, you will need to update the preview behavior, by calling build() on the Builder to create the CaptureRequest, then calling setRepeatingRequest() to override your previous CaptureRequest with the new flash-enabled one.
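
Because camera2 reports its available modes as an int array, checking a requested mode against that array takes a small loop. The following plain-Java sketch shows one way to do that. ModeSupport is a hypothetical helper, and the array stands in for the value returned for CONTROL_AE_AVAILABLE_MODES (the same logic would apply to CONTROL_AF_AVAILABLE_MODES):

```java
// Hypothetical helper: validates a requested camera2 mode.
final class ModeSupport {
  // Returns requested if it appears in available, otherwise fallback.
  // A null array is treated as "nothing available".
  static int resolve(int[] available, int requested, int fallback) {
    if (available != null) {
      for (int mode : available) {
        if (mode == requested) {
          return requested;
        }
      }
    }

    return fallback;
  }
}
```

For flash, you might request CONTROL_AE_MODE_ON_ALWAYS_FLASH and fall back to CONTROL_AE_MODE_ON, then pass the result to set() on each CaptureRequest.Builder.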


Zoom

Flash and focus modes might be the sort of thing that the user could choose before you start up your camera preview, let alone take a picture. Zoom, on the other hand, is the sort of thing that the user will want to adjust on the fly, based on what they see in the preview.

Hence, your first challenge with implementing a zoom feature is deciding how you want users to indicate that they want to zoom in or out, given that probably most of your screen space is taken up by the preview itself. Options include:

Both Android camera APIs have the notion of a numeric zoom level. The bottom end of the zoom range is either 0 (for the classic camera API) or 1 (for the camera2 API). The top end is obtained from the camera APIs. Your job will be to convert whatever input signals you get from the user into a zoom level, then update the camera settings to zoom to that setting.

The code segments shown in this section assume that your input is giving you a zoom level in the 0-100 range, such as via a SeekBar with the default maximum value.


Camera.Parameters offers several methods related to zoom.

The big one is isZoomSupported(). false means that the camera does not offer any sort of zoom (digital or optical). You might use that to disable your zoom input option, so as not to offer something to the user that will not work. Few devices will return false, though.

Assuming isZoomSupported() is true, then getMaxZoom() will tell you the highest possible zoom value. Your overall range of zoom values will be from 0 to this maximum.

If you are using some form of user input that only indicates incremental changes in zoom (e.g., buttons for zoom in and zoom out), you can use getZoom() to find out the current zoom value. You can then increment or decrement that value and check your new value against the ends of the range (0 and getMaxZoom()) to ensure that it is valid.
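
The range check for incremental zoom amounts to clamping. Here is a minimal plain-Java sketch, where ClassicZoom is a hypothetical helper and currentZoom and maxZoom stand in for the values from getZoom() and getMaxZoom():

```java
// Hypothetical helper for button-style zoom controls.
final class ClassicZoom {
  // Applies delta (positive to zoom in, negative to zoom out) and
  // clamps the result to the valid range of 0 through maxZoom.
  static int step(int currentZoom, int delta, int maxZoom) {
    return Math.max(0, Math.min(maxZoom, currentZoom+delta));
  }
}
```

If the value that comes back equals the current zoom, the user has hit one end of the range, and you can skip updating the camera.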

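Given a new zoom value, you have two choices for applying it: call setZoom() on the Camera.Parameters and then setParameters() on the Camera, for an immediate change; or call startSmoothZoom() on the Camera, for a gradual transition, on cameras where isSmoothZoomSupported() returns true.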

The following code snippet takes a zoom level from 0 to 100 and zooms the camera, assuming zoom is supported:

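public boolean zoomTo(Camera camera, int zoomLevel) {
  Camera.Parameters params=camera.getParameters();
  int zoom=zoomLevel*params.getMaxZoom()/100;
  boolean result=false;

  if (params.isSmoothZoomSupported()) {
    // assumes the enclosing class implements Camera.OnZoomChangeListener
    camera.setZoomChangeListener(this);
    camera.startSmoothZoom(zoom);
    result=true; // assumed meaning: a smooth zoom is in progress
  }
  else if (params.isZoomSupported()) {
    params.setZoom(zoom);
    camera.setParameters(params);
  }
  return(result);
}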


You will notice that if isSmoothZoomSupported() returns true, we not only call startSmoothZoom(), but we also call setZoomChangeListener(). This registers a listener to find out about how the smooth zoom is progressing. In particular, you should disable further changes to the zoom until the smooth zoom process completes. Your OnZoomChangeListener will be called with onZoomChange() for each incremental change in the zoom from start to finish, with stopped set to true when we are done with the smooth zoom operation:

public void onZoomChange(int zoomValue, boolean stopped,
                         Camera camera) {
  if (stopped) {
    // do something
  }
}

If you need to stop the smooth zoom before completion, there is a stopSmoothZoom() method on Camera that you can call. For example, instead of disabling zoom controls, you might stop the current smooth zoom operation if the user chooses a new zoom level, then start a fresh smooth zoom operation to the newly-requested level.


(the author would like to thank Daniel Albert for helping with this section)

On the surface, the camera2 API works much the same: you find out the maximum zoom value, translate your user input into the valid zoom value range (this time, from 1.0f to the maximum), and then update the camera for that zoom value.

However, that last step is substantially different than before.

For digital zoom, rather than saying “zoom in to this value”, we say “crop the camera inputs to this rectangle, and expand that rectangle to fill the preview or the picture”. This is rather more complex, albeit with potentially more power.

To find out the maximum digital zoom value, call get(CameraCharacteristics.SCALER_AVAILABLE_MAX_DIGITAL_ZOOM) on the CameraCharacteristics for the camera in question. That will be a float value. 1.0f would indicate that the camera cannot perform digital zoom. The range of possible digital zoom values is from 1.0f to whatever the maximum is.

So, the first part of this edition of zoomTo() normalizes a 0-100 integer into a float representing the zoom value:

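// mgr (a CameraManager), zoomRect (a Rect), and handler (a Handler)
// are assumed to be fields of the enclosing class
public boolean zoomTo(String cameraId,
                      CaptureRequest.Builder previewRequestBuilder,
                      CameraCaptureSession captureSession,
                      int zoomLevel) {
  try {
    final CameraCharacteristics cc=
      mgr.getCameraCharacteristics(cameraId);
    final float maxZoom=
      cc.get(CameraCharacteristics.SCALER_AVAILABLE_MAX_DIGITAL_ZOOM);

    // if <=1, zoom not possible, so eat the event
    if (maxZoom>1.0f) {
      float zoomTo=1.0f+((float)zoomLevel*(maxZoom-1.0f)/100.0f);
      zoomRect=cropRegionForZoom(cc, zoomTo);

      previewRequestBuilder
        .set(CaptureRequest.SCALER_CROP_REGION, zoomRect);
      captureSession.setRepeatingRequest(previewRequestBuilder.build(),
        null, handler);
    }
  }
  catch (CameraAccessException e) {
    // ummm... do something
  }

  return(false); // assumed: no in-progress zoom for the caller to wait on
}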


Given a zoom value, we need to determine the Rect that represents the subset of the field of vision that we want to zoom into. The following algorithm zooms into the center of the field:

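private static Rect cropRegionForZoom(CameraCharacteristics cc,
                                      float zoomTo) {
  Rect sensor=
    cc.get(CameraCharacteristics.SENSOR_INFO_ACTIVE_ARRAY_SIZE);
  int sensorCenterX=sensor.width()/2;
  int sensorCenterY=sensor.height()/2;
  int deltaX=(int)(0.5f*sensor.width()/zoomTo);
  int deltaY=(int)(0.5f*sensor.height()/zoomTo);

  return(new Rect(
    sensorCenterX-deltaX,
    sensorCenterY-deltaY,
    sensorCenterX+deltaX,
    sensorCenterY+deltaY));
}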
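
To make the arithmetic concrete, here is a plain-Java sketch of the same two calculations, with no Android classes involved. ZoomCrop is a hypothetical helper, and the crop region comes back as an int array of left, top, right, and bottom rather than as a Rect:

```java
// Hypothetical helper mirroring the zoom math, minus the Android types.
final class ZoomCrop {
  // Maps a 0-100 zoom level onto the 1.0f-to-maxZoom range.
  static float zoomForLevel(int zoomLevel, float maxZoom) {
    return 1.0f+((float)zoomLevel*(maxZoom-1.0f)/100.0f);
  }

  // Returns {left, top, right, bottom} for a crop centered on the
  // sensor, covering 1/zoomTo of the sensor's width and height.
  static int[] cropRegion(int sensorWidth, int sensorHeight,
                          float zoomTo) {
    int centerX=sensorWidth/2;
    int centerY=sensorHeight/2;
    int deltaX=(int)(0.5f*sensorWidth/zoomTo);
    int deltaY=(int)(0.5f*sensorHeight/zoomTo);

    return new int[] {centerX-deltaX, centerY-deltaY,
                      centerX+deltaX, centerY+deltaY};
  }
}
```

For a 4000x3000 sensor at a zoom value of 2.0f, the crop region works out to (1000, 750, 3000, 2250): half the width and half the height, centered on the sensor.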

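The Rect returned by cropRegionForZoom() then gets used as the SCALER_CROP_REGION value set on the preview CaptureRequest.Builder in zoomTo().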

Note that this does not cover optical zoom. On Android 5.0, that is handled as available focal lengths. You can get the list of available focal lengths by requesting LENS_INFO_AVAILABLE_FOCAL_LENGTHS from the CameraCharacteristics. Setting LENS_FOCAL_LENGTH on a CaptureRequest.Builder will shift the camera’s focal length as requested. This may take a moment, as optical zoom usually requires mechanical changes in the camera configuration. The LENS_STATE (on CaptureResult) will be reported as MOVING while the focal length is changing, or STATIONARY once the focal length has reached the requested value.
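
Since only the reported focal lengths are valid, you may need to snap a requested value to the nearest available one. Here is a minimal plain-Java sketch of that step. FocalLengths is a hypothetical helper, and the float array stands in for the value of LENS_INFO_AVAILABLE_FOCAL_LENGTHS:

```java
// Hypothetical helper: snaps a request to a supported focal length.
final class FocalLengths {
  // Returns the entry in available that is closest to requested.
  static float nearest(float[] available, float requested) {
    if (available == null || available.length == 0) {
      throw new IllegalArgumentException("no focal lengths reported");
    }

    float best=available[0];

    for (float candidate : available) {
      if (Math.abs(candidate-requested) < Math.abs(best-requested)) {
        best=candidate;
      }
    }

    return best;
  }
}
```

The result is the value that you would set for LENS_FOCAL_LENGTH. A camera with a fixed lens reports a single focal length, in which case this always returns that one value.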

And Now, The Problems

Of course, taking pictures is not nearly this simple. The preceding sections glossed over all sorts of problems that you will run into in practice when trying to implement these APIs. The following sections outline a few of those problems, particularly ones that will affect both camera APIs.

Choosing a Preview Size

Camera drivers are capable of delivering preview images to your preview surface in one of several resolutions. You have to sift through a roster of resolutions and choose one.

Your gut instinct might be to choose the highest-available resolution. After all, that should result in the highest-quality previews. However, this can be wasteful, if the preview images are significantly bigger than your preview surface. Plus, the larger the preview frames, the slower the camera driver will be to deliver them, reducing your possible frames-per-second (fps) for the previews. You might instead elect to choose the largest preview that is smaller than the surface, or some algorithm like that.
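
Here is a minimal plain-Java sketch of the "largest preview that is smaller than the surface" rule. PreviewSizeChooser is a hypothetical helper; sizes are represented as two-element int arrays of width and height rather than as the size classes of either camera API, and the surface dimensions are assumed to be expressed in the same orientation as the preview sizes. Falling back to the smallest size when nothing fits is an assumption of this sketch, not a rule from the camera APIs:

```java
import java.util.List;

// Hypothetical helper: picks a preview size for a given surface.
final class PreviewSizeChooser {
  // Each size is a two-element array: {width, height}.
  static int[] choose(List<int[]> sizes, int surfaceWidth,
                      int surfaceHeight) {
    int[] bestFit=null;
    int[] smallest=null;

    for (int[] size : sizes) {
      if (smallest == null || area(size) < area(smallest)) {
        smallest=size;
      }

      boolean fits=
          size[0] <= surfaceWidth && size[1] <= surfaceHeight;

      if (fits && (bestFit == null || area(size) > area(bestFit))) {
        bestFit=size;
      }
    }

    // nothing fits inside the surface: settle for the smallest size
    return bestFit != null ? bestFit : smallest;
  }

  private static long area(int[] size) {
    return (long)size[0]*size[1];
  }
}
```

Given sizes of 1920x1080, 1280x720, and 640x480 and a 1300x800 surface, this picks 1280x720.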

Previews and Aspect Ratios

Compounding the problem of choosing preview sizes is that the resolutions of available preview sizes bear no relationship at all to the size of your preview surface. After all, you might have a TextureView that fills the screen, or you might have a TextureView that is rather tiny. That is up to you from a UI design standpoint; the camera driver is oblivious to such considerations.

In particular, the aspect ratios (width divided by height) of the preview frames do not necessarily have to match the aspect ratio of your preview surface. For example, few camera drivers support square previews, yet for aesthetic reasons you might be aiming for a square preview surface.

You have two main approaches for dealing with this: letterboxing and cropping.

Letterboxing is where your preview frames retain their aspect ratio, but do not fill up all the available space in the preview surface. Instead, part of the preview surface is unused. For example, if your preview surface is square, and your preview frames have a landscape aspect ratio (width is greater than the height), letterboxing would show the landscape aspect ratio within the square box of the preview surface, with black bars for the unused portion of the square’s height. Typically, using gravity, you try to have the preview frames be centered and the unused portion of the surface be split to either side of the frames.

If you want to fill the preview surface, then letterboxing is not a viable option. However, if you just take the preview frames and try to put them into the surface, the surface will stretch the frames to fit. If the aspect ratio of the frames differs significantly from that of the surface, the subject matter in the preview will seem significantly stretched, either vertically or horizontally.

The trick to deal with this, on API Level 14+ (with graphics acceleration enabled, as is the default), is to have the surface be bigger than what you really want, but then to have something overlapping the surface and causing it to be visually cropped. You have your new, larger surface match the aspect ratio of the preview frames, so there is no stretching. However, now what the user sees in your preview surface may differ substantially from what winds up in the picture or video, as you are cropping off portions that do not fit your preview surface, where those cropped areas might well show up in final output.
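
Both approaches come down to a bit of aspect ratio arithmetic. The following plain-Java sketch computes the size at which to show the preview frames for each approach. PreviewGeometry is a hypothetical helper, sizes come back as int arrays of width and height, and the aspect ratios are compared by cross-multiplying to avoid floating-point rounding:

```java
// Hypothetical helper: sizes a preview for letterboxing or cropping.
final class PreviewGeometry {
  // Largest {width, height} with the frame's aspect ratio that fits
  // entirely inside the surface (letterboxing).
  static int[] letterbox(int frameW, int frameH,
                         int surfaceW, int surfaceH) {
    if ((long)frameW*surfaceH > (long)surfaceW*frameH) {
      // frame is relatively wider than the surface: fill the width
      return new int[] {surfaceW,
                        (int)((long)surfaceW*frameH/frameW)};
    }

    // frame is relatively taller, or the same shape: fill the height
    return new int[] {(int)((long)surfaceH*frameW/frameH), surfaceH};
  }

  // Smallest {width, height} with the frame's aspect ratio that
  // completely covers the visible area (cropping).
  static int[] cover(int frameW, int frameH,
                     int visibleW, int visibleH) {
    if ((long)frameW*visibleH > (long)visibleW*frameH) {
      // frame is relatively wider: match the height, overflow the width
      long w=((long)visibleH*frameW+frameH-1)/frameH; // round up
      return new int[] {(int)w, visibleH};
    }

    long h=((long)visibleW*frameH+frameW-1)/frameW; // round up
    return new int[] {visibleW, (int)h};
  }
}
```

For 1920x1080 preview frames and a 1000x1000 square, letterbox() yields 1000x562 (leaving bars above and below), while cover() yields 1778x1000 (with the excess width cropped off by whatever overlaps the surface).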

Choosing a Picture or Video Size

Choosing a picture or video size is reminiscent of choosing a preview size. While many cases will call for as high a resolution as you can muster, some use cases will lead you towards choosing a lower resolution. For example, situations requiring a rapid upload of the resulting media might select a lower resolution, as that will reduce the file size and make the upload process that much faster.

Also, bear in mind that the aspect ratios of the available picture or video sizes do not necessarily match the aspect ratio of either the preview frames or your preview surface. Emphasize to your users that the preview surface is for aiming the camera; what actually gets recorded may be somewhat different in scope but should be centered on the same spot.

Picture Orientation

Your app may wish to take pictures in both landscape and portrait modes. However, the camera drivers are designed around taking pictures in landscape, particularly for rear-facing cameras.

You can hint to the camera driver what orientation you think the resulting picture should have, such as via setRotation() on the Camera.Parameters in the original camera API. However, as the documentation for that method states:

The camera driver may set orientation in the EXIF header without rotating the picture. Or the driver may rotate the picture and the EXIF thumbnail. If the Jpeg picture is rotated, the orientation in the EXIF header will be missing or 1 (row #0 is top and column #0 is left side).

Many camera drivers take the approach of leaving the image alone and setting the Orientation EXIF header. That header tells image viewers to rotate the image. Unfortunately, not all image viewers or image decoding libraries pay attention to this. Notably, Android usually does not pay attention to this, as BitmapFactory ignores this EXIF header. As a result, when you go to load in your own picture that you took, your result may come out mis-oriented.

You have two major choices:

  1. Put more smarts in any logic that you are using to display images that you take with the camera, where you read the EXIF headers yourself and you arrange to rotate the image as needed, perhaps by rotating the ImageView you are using to show the image.
  2. As part of post-processing the image before saving it, you rotate the image based upon what is in the EXIF header, and save the image with the proper rotation and no EXIF header. This has the advantage of making the image “correct” for all image viewers. However, rotating full-resolution photos is rather memory-intensive and slow. Using NDK code, such as this library, may be able to help.
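
Either way, you need to turn the Orientation value from the EXIF header into a rotation. Here is a minimal plain-Java sketch of that mapping. ExifRotation is a hypothetical helper; the numeric values are the standard EXIF Orientation codes (the same values as the ORIENTATION_ROTATE_* constants on Android's ExifInterface), and this sketch deliberately ignores the mirrored orientations:

```java
// Hypothetical helper: maps an EXIF Orientation code to a rotation.
final class ExifRotation {
  // Returns the clockwise rotation, in degrees, needed to display
  // the image upright. Mirrored orientations (2, 4, 5, 7) and
  // unknown values are treated as needing no rotation.
  static int degreesFor(int exifOrientation) {
    switch (exifOrientation) {
      case 3:
        return 180;
      case 6:
        return 90;
      case 8:
        return 270;
      default:
        return 0;
    }
  }
}
```

You might feed the result to setRotation() on an ImageView for the first approach, or to a rotation of the decoded Bitmap for the second.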

Storage Considerations

Bear in mind that if you wish to save pictures or videos in common locations on external storage, such as the standard location for digital camera output (Environment.DIRECTORY_DCIM), you will need the WRITE_EXTERNAL_STORAGE permission on all relevant API levels. As of Android 6.0 (API Level 23), this is a dangerous permission handled via the runtime permission system, so you will need to have the <uses-permission> element in the manifest and ask the user for that permission at runtime.

Also, files written out to external storage will not be picked up immediately by MediaStore, and so “gallery” and related apps that rely upon the MediaStore will not see your pictures or videos. You can use MediaScannerConnection to proactively have the MediaStore add your newly-created files to the index, as was covered earlier in the book.

Configuration Changes

Opening and closing a camera each takes a fair amount of time. As a result, if your app wants to support taking pictures and videos in either portrait or landscape, this is a case where you will want to strongly consider using a retained fragment to hold onto your Camera (or combination of CameraManager and CameraDevice) across a configuration change. That way, Android will not destroy and recreate the fragment, and you can keep the camera open during the change.

Camera Peeking Attacks

(NOTE: this section is based upon a blog post from the author)

A research paper points out an interesting Android attack vector, resulting in a possible leak of private information. The paper’s authors refer to it as the “camera peeking” attack.

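An Android camera driver can only be used by one app at a time. The attack is simple: a malicious app watches for another app to be using the camera, polls until the camera becomes available again, and then immediately takes its own picture, in the hope that the camera is still pointed at whatever the user was just photographing.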

The example cited by the paper’s authors is to watch for a banking app taking a photo of a check, to try to take another photo of the check to send to those who might use the information for various types of fraud.

Polling for camera availability is slow, simply because the primary way to see if the camera is available is to open it, and that takes hundreds of milliseconds. The paper’s specific technique helped to minimize the polling, by knowing when the right activity was in the foreground and therefore the camera was probably already in use. Then, it would be a matter of polling until the camera is available again and taking a picture. Even without the paper’s specific attack techniques, this general attack is possible, and there may be more efficient ways to see if the camera is in use.

On the other hand, the defense is simple: if your app is taking pictures, and those pictures may be of sensitive documents, ask the user to point the camera somewhere else before you release the camera. So long as you have exclusive control over the camera, nothing else can use it, including any attackers.

A sophisticated implementation of this might use image-recognition techniques to see, based upon preview frames plus the taken picture, if the camera is pointing somewhere else. For example, a banking app offering check-scanning might determine if the dominant color in the camera field significantly changes, as that would suggest that the camera is no longer pointed at a check, since checks are typically fairly monochromatic.

Or, just ask the user to point the camera somewhere else, then close the camera after some random number of seconds.
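
For that simpler approach, the only real logic is picking the delay. Here is a minimal plain-Java sketch, where ReleaseDelay is a hypothetical helper and the bounds are whatever range of seconds you consider tolerable for your users:

```java
import java.security.SecureRandom;

// Hypothetical helper: picks how long to hold the camera open.
final class ReleaseDelay {
  private static final SecureRandom RANDOM=new SecureRandom();

  // Returns a delay in milliseconds, a whole number of seconds
  // between minSeconds and maxSeconds, inclusive.
  static long randomDelayMillis(int minSeconds, int maxSeconds) {
    if (minSeconds < 0 || maxSeconds < minSeconds) {
      throw new IllegalArgumentException("invalid delay range");
    }

    int seconds=minSeconds+RANDOM.nextInt(maxSeconds-minSeconds+1);

    return seconds*1000L;
  }
}
```

On Android, you might pass the result to postDelayed() on a Handler, with a Runnable that closes the camera.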

General-purpose camera apps might offer an “enhanced security” mode that does this sort of thing, but having that on by default might annoy the user trying to take pictures at the zoo, or at a sporting event. However, document-scanning apps might want to have this mode on by default, and check-scanning apps might simply always use this mode.