Thursday, June 7, 2012

A free course meal

I received an email yesterday from Udacity, saying they're launching 5 new online courses. I took one of their courses this spring ("CS373: Programming a Robotic Car"), and I think they're an excellent alternative if you're looking to broaden your knowledge (for free) and get a basic understanding of new topics. All it costs is your time (which is precious enough, I know!).
The new courses are: "Intro to Physics", "Intro to Statistics", "Algorithms", "Logic and Discrete Mathematics" and "Software Testing".

While I'm at it, I should mention a few other (free) online educational organizations that I have personal experience with:

Coursera
I took their Machine Learning course this winter, when it was still just a prototype. Taught by Stanford professor Andrew Ng, it was a great experience that gave me a lot of insight into an area that was new to me.

Coursera seems to have a wider span of science topics, whereas Udacity seems to focus more on software development.

I feel one can't mention online education without also bringing up Khan Academy, the institution that pioneered it all and has gamified the experience the most. It presents (mostly) mathematical topics in a very personal and charismatic way, with exercises for those who want to practice. A great way to freshen up your linear algebra for the other courses, perhaps?

Saturday, June 2, 2012

Shi-Tomasi corner detection on Android

The following tutorial describes how to implement a real-time Shi-Tomasi corner detection application in an Android environment using OpenCV.
It's aimed at those new to either OpenCV or Android development, and was created as a way (for me, as a Java and Android developer) to get into the world of OpenCV.

Why Corner Detection?
Corner detection is a basic stepping stone used in many computer vision applications when recognizing features in images.

http://en.wikipedia.org/wiki/Corner_detection

In this case, it's a good first glimpse of OpenCV if you're new to it, since it requires little implementation. It's also a good exercise if you're familiar with OpenCV but new to Android development, as it is a relatively simple Android application as well.

I'd recommend familiarizing yourself with the Android examples provided by OpenCV, since you will then recognize the code here more easily. http://code.opencv.org/projects/opencv/wiki/OpenCV4Android

Basics:
The application structure is based on the tutorial-2-opencvcamera OpenCV Android example.
The actual implementation is based on this C++ tutorial:

http://opencv.itseez.com/doc/tutorials/features2d/trackingmotion/good_features_to_track/good_features_to_track.html#good-features-to-track
It is, however, uncommented, so I'll try to explain each step as we go.

CameraBaseActivity.java
This class is more or less identical with the one in the tutorial. If you're familiar with android activities, you can jump to the next class.

import android.app.Activity;
import android.os.Bundle;
import android.util.Log;
import android.view.Window;
import android.view.WindowManager;

public class CameraBaseActivity extends Activity {
    private static final String TAG = "Sample::Activity";

    private CvViewBase mView;

    public CameraBaseActivity() {
        Log.i(TAG, "Instantiated new " + this.getClass());
    }

    @Override
    public void onCreate(Bundle savedInstanceState) {
        Log.i(TAG, "onCreate");
        super.onCreate(savedInstanceState);
        // Remove the title bar, go full screen, and keep the screen awake.
        requestWindowFeature(Window.FEATURE_NO_TITLE);
        getWindow().addFlags(WindowManager.LayoutParams.FLAG_FULLSCREEN
                | WindowManager.LayoutParams.FLAG_KEEP_SCREEN_ON);
        // The view does most of the actual work.
        mView = new ShiTomasiView(this);
        setContentView(mView);
    }
}

Nothing too surprising in this little class. The onCreate() method sets a few window flags, removing the title bar and making sure the screen stays on even if it isn't touched for a while.
It also creates an instance of our view, which will do most of the work.

ShiTomasiView is an extension of the CvViewBase class. Again, CvViewBase is more or less identical to the example class, so if you're familiar with that, you can skip this section.

CvViewBase.java
import java.util.List;
import org.opencv.core.Size;
import org.opencv.highgui.Highgui;
import org.opencv.highgui.VideoCapture;
import android.content.Context;
import android.graphics.Bitmap;
import android.graphics.Canvas;
import android.util.Log;
import android.view.SurfaceHolder;
import android.view.SurfaceView;

public abstract class CvViewBase extends SurfaceView implements SurfaceHolder.Callback, Runnable {
    private static final String TAG = "Sample::SurfaceView";

    private SurfaceHolder mHolder;
    private VideoCapture mCamera;

    public CvViewBase(Context context) {
        super(context);
        mHolder = getHolder();
        mHolder.addCallback(this);
        Log.i(TAG, "Instantiated new " + this.getClass());
    }

Here we see the first reference to OpenCV: a VideoCapture object. Aptly named, it's used to grab images from the device's camera. We also see that the class is a SurfaceView, which gives us something to draw on, and a Runnable, so the processing can run on its own thread.

    public void surfaceCreated(SurfaceHolder holder) {
        Log.i(TAG, "surfaceCreated");
        mCamera = new VideoCapture(Highgui.CV_CAP_ANDROID);
        if (mCamera.isOpened()) {
            (new Thread(this)).start();
        } else {
            mCamera.release();
            mCamera = null;
            Log.e(TAG, "Failed to open native camera");
        }
    }

When the surface is first created, we instantiate a new VideoCapture and tell it which capture device we want (the Android camera, in this case).

If all is well, we start the thread (this is a Runnable).

    public void surfaceChanged(SurfaceHolder _holder, int format, int width, int height) {
        Log.i(TAG, "surfaceChanged");
        synchronized (this) {
            if (mCamera != null && mCamera.isOpened()) {
                Log.i(TAG, "before mCamera.getSupportedPreviewSizes()");
                List<Size> sizes = mCamera.getSupportedPreviewSizes();
                Log.i(TAG, "after mCamera.getSupportedPreviewSizes()");
                int mFrameWidth = width;
                int mFrameHeight = height;

                // Select the supported preview size whose height is
                // closest to the height of the new surface.
                {
                    double minDiff = Double.MAX_VALUE;
                    for (Size size : sizes) {
                        if (Math.abs(size.height - height) < minDiff) {
                            mFrameWidth = (int) size.width;
                            mFrameHeight = (int) size.height;
                            minDiff = Math.abs(size.height - height);
                        }
                    }
                }

                mCamera.set(Highgui.CV_CAP_PROP_FRAME_WIDTH, mFrameWidth);
                mCamera.set(Highgui.CV_CAP_PROP_FRAME_HEIGHT, mFrameHeight);
            }
        }
    }

Every time the surface changes (for example, when you flip from portrait to landscape), this method is called. It goes through the camera's supported preview sizes and picks the one whose height is closest to the new surface height. For example, if the surface is 800×480 and the camera supports 640×480 and 1280×720, it picks 640×480, since the height matches exactly.

    public void surfaceDestroyed(SurfaceHolder holder) {
        Log.i(TAG, "surfaceDestroyed");
        if (mCamera != null) {
            synchronized (this) {
                mCamera.release();
                mCamera = null;
            }
        }
    }

When the surface is no longer used, this method releases the VideoCapture resources. Those were the basic surface handling methods. Now we come to the parts that actually do something.

protected abstract Bitmap processFrame(VideoCapture capture);

This abstract method is where the application-specific logic lives. It is called from the run() method, below.
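
To make the pattern concrete, here is a minimal sketch (my own, not part of the tutorial) of what a do-nothing subclass could look like. It just retrieves the frame that run() grabbed and converts it to a Bitmap, with no processing; it assumes the same imports as the classes in this post.

public class PassThroughView extends CvViewBase {

    private Mat mFrame = new Mat();

    public PassThroughView(Context context) {
        super(context);
    }

    @Override
    protected Bitmap processFrame(VideoCapture capture) {
        // Retrieve the frame that run() just grabbed, unprocessed.
        capture.retrieve(mFrame, Highgui.CV_CAP_ANDROID_COLOR_FRAME_RGB);
        Bitmap bmp = Bitmap.createBitmap(mFrame.cols(), mFrame.rows(), Bitmap.Config.RGB_565);
        Utils.matToBitmap(mFrame, bmp);
        return bmp;
    }
}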

    public void run() {
        Log.i(TAG, "Starting processing thread");
        while (true) {
            Bitmap bmp = null;

            synchronized (this) {
                if (mCamera == null)
                    break;

                if (!mCamera.grab()) {
                    Log.e(TAG, "mCamera.grab() failed");
                    break;
                }

                bmp = processFrame(mCamera);
            }

            if (bmp != null) {
                Canvas canvas = mHolder.lockCanvas();
                if (canvas != null) {
                    canvas.drawBitmap(bmp, (canvas.getWidth() - bmp.getWidth()) / 2,
                            (canvas.getHeight() - bmp.getHeight()) / 2, null);
                    mHolder.unlockCanvasAndPost(canvas);
                }
                bmp.recycle();
            }
        }

        Log.i(TAG, "Finishing processing thread");
    }

The main loop of the application. The Bitmap bmp is where the results of the processFrame() method will end up. So, this is what the run() method does in general:

Create a new Bitmap.

Grab the image from the camera (this is an important step, which must be called prior to retrieve() )

Call processFrame(), which will do the manipulations.

Draw whatever comes back (if anything).

Repeat.

As you might have noticed, so far it doesn't do anything. In fact, it doesn't even work on its own, since it calls an abstract method that isn't implemented anywhere yet.
This brings us to the next class, ShiTomasiView.java. It's based on Sample2View.java in the example.

ShiTomasiView.java:
import android.content.Context;
import android.graphics.Bitmap;
import android.util.Log;
import android.view.SurfaceHolder;
import org.opencv.android.Utils;
import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint;
import org.opencv.core.Point;
import org.opencv.core.Scalar;
import org.opencv.highgui.Highgui;
import org.opencv.highgui.VideoCapture;
import org.opencv.imgproc.Imgproc;

public class ShiTomasiView extends CvViewBase {

    private Mat sceneColor;
    private Mat sceneGrayScale;
    private final static double qualityLevel = 0.35;
    private final static double minDistance = 10;
    private final static int blockSize = 8;
    private final static boolean useHarrisDetector = false;
    private final double k = 0.0;
    private final static int maxCorners = 100;
    private final static Scalar circleColor = new Scalar(0, 255, 0);

    public ShiTomasiView(Context context) {
        super(context);
    }

First of all, we declare a bunch of variables. Mat is OpenCV's matrix class. For starters, we're going to use two Mats to store the camera image in: one in color (RGB) and one in grayscale.

The rest are values needed for the corner detection algorithm. You will most likely find that you have to tweak them to suit your needs. A good follow-up exercise could be to let the application find the best values by itself, as sketched below. Let's ignore them for now; I'll explain them when we need them.
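
As a rough illustration of that follow-up exercise (entirely my own sketch, not part of the tutorial), you could replace the final qualityLevel constant with a mutable field and adjust it with a simple feedback loop, based on how many corners each frame produced:

    private double adaptiveQualityLevel = 0.35; // replaces the final constant
    private static final int TARGET_CORNERS = 50; // arbitrary target; pick your own

    // Call this after each frame with the number of corners just found.
    private void tuneQualityLevel(int cornersFound) {
        if (cornersFound > TARGET_CORNERS && adaptiveQualityLevel < 0.9) {
            adaptiveQualityLevel += 0.01; // too many corners: be stricter
        } else if (cornersFound < TARGET_CORNERS && adaptiveQualityLevel > 0.05) {
            adaptiveQualityLevel -= 0.01; // too few corners: be more lenient
        }
    }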

    public void surfaceChanged(SurfaceHolder _holder, int format, int width, int height) {
        super.surfaceChanged(_holder, format, width, height);

        synchronized (this) {
            sceneGrayScale = new Mat();
            sceneColor = new Mat();
        }
    }

When the surface changes (which also happens right after it has been created), we instantiate the two Mats we're going to use. Don't forget to call super.

Now we come to the actual image processing method. Let's break it up a bit.

    protected Bitmap processFrame(VideoCapture capture) {
        capture.retrieve(sceneColor, Highgui.CV_CAP_ANDROID_COLOR_FRAME_RGB);

        Imgproc.cvtColor(sceneColor, sceneGrayScale, Imgproc.COLOR_RGB2GRAY);

Remember how we called capture.grab() in the previous class? Now we retrieve that frame from the camera and put it in the sceneColor matrix.

We then use cvtColor() with the COLOR_RGB2GRAY constant to convert it to grayscale, putting the result in the sceneGrayScale Mat.

Why convert the image to grayscale?
It's common practice in computer vision, for two reasons as far as I know:
The detector works on intensity contrast, which the grayscale image preserves, so features remain easily distinguishable without color information.
It improves performance and memory usage, since a pixel is stored in 1 byte instead of 3 (for a 640×480 frame, that's roughly 300 KB instead of 900 KB).

        MatOfPoint corners = new MatOfPoint();

        Imgproc.goodFeaturesToTrack(sceneGrayScale,
                corners,
                maxCorners,
                qualityLevel,
                minDistance,
                new Mat(),
                blockSize,
                useHarrisDetector,
                k);

Now it's time to put the library to good use. goodFeaturesToTrack() does most of the work for us and gives us back a list of corners (in the form of a MatOfPoint). To do this, we need to supply it with some values, so let's go through the parameters.

sceneGrayScale is the image we want to detect corners in.
corners is the list of corners found by the algorithm.
maxCorners is the maximum number of corners we want it to return.
qualityLevel is the minimum quality a result must have to be considered a corner, measured relative to the best corner found.
minDistance is the minimum distance in pixels required between one corner and the next.
Next we can supply a mask Mat in case we want to restrict the search to a certain area of the image; an empty Mat means the whole image is searched.
blockSize is the size in pixels of the neighborhood the algorithm uses when evaluating corners.
The boolean decides whether we're going to use Harris corner detection or not. In this example we aren't, but see the sketch after this list.
We can ignore the k value, since it's only used in Harris corner detection.
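
As an aside (my own sketch, not from the tutorial), if you did want to try the Harris variant, the call might look like this; k = 0.04 is a commonly used starting value:

        // The same call, but with the Harris corner measure enabled.
        Imgproc.goodFeaturesToTrack(sceneGrayScale, corners, maxCorners,
                qualityLevel, minDistance, new Mat(), blockSize, true, 0.04);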


        Bitmap bmp = Bitmap.createBitmap(sceneGrayScale.cols(), sceneGrayScale.rows(), Bitmap.Config.RGB_565);
        Point[] points = corners.toArray();
        for (Point p : points) {
            Core.circle(sceneColor, p, 5, circleColor);
        }
        try {
            Utils.matToBitmap(sceneColor, bmp);
        } catch (Exception e) {
            Log.e(this.getClass().getSimpleName(), "Exception thrown: " + e.getMessage());
            bmp.recycle();
        }
        return bmp;
    }

We're now ready to draw our results, so we create a new bitmap, defining it as an RGB image (no alpha).

To access the Point elements, we convert the MatOfPoint to an array.

OpenCV has a bunch of image manipulation methods built in, which we're going to use. An alternative would be to draw everything on an Android Canvas instead, as sketched below.
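
For the curious, here is a minimal sketch of that Canvas alternative (my own variable names; it assumes android.graphics.Canvas, Paint, and Color are imported):

        // Convert the unmodified frame first, then draw with Android's own API.
        Utils.matToBitmap(sceneColor, bmp);
        Canvas overlay = new Canvas(bmp); // draws directly into the bitmap
        Paint paint = new Paint();
        paint.setColor(Color.GREEN);
        paint.setStyle(Paint.Style.STROKE);
        for (Point p : points) {
            overlay.drawCircle((float) p.x, (float) p.y, 5, paint);
        }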

Core.circle() draws a circle on the supplied Mat at each point, with a radius (5 on this occasion) and a color (green), which we defined as a static field at the beginning of this class.

matToBitmap converts the Mat to a Bitmap (obviously).

The bitmap is returned to the run() method in CvViewBase, which draws it onto the Canvas.

That's it! (Almost).

    public void run() {
        super.run();

        synchronized (this) {
            // Explicitly deallocate the Mats' native memory.
            if (sceneColor != null) {
                sceneColor.release();
            }
            if (sceneGrayScale != null) {
                sceneGrayScale.release();
            }

            sceneColor = null;
            sceneGrayScale = null;
        }
    }

The class also overrides the run() method and releases the Mats when the processing loop finishes (as per the example). Since a Mat wraps native memory, it's good practice to release it explicitly rather than rely on the garbage collector.

Congratulations! You now have an application that detects interest points in real time.

Stay tuned for more stuff.