Friday, December 18, 2009

Diver's going to Spain!

A paper based on some of the innovations in Diver has been accepted to the 2010 European Conference on Software Maintenance and Reengineering. I will be presenting the paper in Madrid in March.

A lot of technical and technological innovation does come from academia. You don't always see it directly in commercial products right away, but I do believe that it has influence, especially as software engineering and computer science graduates move into industry and bring their research expertise with them. A lot of the things that we take for granted in our IDEs were first written up in academic conferences or journals. Take the interactive debugger that we all use all the time as an example [Isoda, 87]. I think that industry and academia really benefit from good research relationships.

Anyway, it turns out that the way that Diver reduces the size of trace visualizations (in the sequence diagram) by cross-linking the trace with the source code is actually a new idea. So, the nice people at CSMR decided to let me write about it and give a presentation.

The paper is called "Utilizing Debug Information to Compact Loops in Large Program Traces." (Sounds exciting, doesn't it? ;-) ). Everyone knows that most of the execution of any program is really caused by an enormous number of repeated calls to a subset of the methods/functions/subs/whatever that are written in the code. Most of these calls occur within loop structures. That causes a big problem for visualizing traces. Most standard programs will have millions of calls in the span of a few seconds (Hello World in Java has upwards of 100 000 calls or something ridiculous like that!). That's far too big for any standard visualization.

So, what Diver does that is unique is it recognizes that most of the repetitive calls will occur within loops that can be found in the source code of your program. So, it searches your source code (or the source code attached to your jars in Eclipse), and looks for where calls originate from. If they occur in a loop, then only one iteration of the loop is shown at a time.
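To make the idea concrete, here is a minimal sketch of loop compaction. It is not Diver's actual implementation: the Call and Loop classes, the sample trace, and the rule that a new iteration starts whenever the call-site line stops advancing are all assumptions made for illustration. It also assumes that each traced call records the source line of its call site (the kind of thing debug information can provide) and that the loop's line range has already been found in the source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LoopCompactionSketch {
    /** One traced call: callee name plus the source line of its call site (hypothetical model). */
    static final class Call {
        final String callee;
        final int line;
        Call(String callee, int line) { this.callee = callee; this.line = line; }
        @Override public String toString() { return callee + "@" + line; }
    }

    /** A loop found in the source, as an inclusive range of line numbers (hypothetical model). */
    static final class Loop {
        final int firstLine;
        final int lastLine;
        Loop(int firstLine, int lastLine) { this.firstLine = firstLine; this.lastLine = lastLine; }
        boolean contains(int line) { return line >= firstLine && line <= lastLine; }
    }

    /**
     * Keeps every call made outside the loop, but only the first iteration of
     * the calls made inside it. Assumption: within one iteration the call-site
     * lines strictly increase, so a line that does not advance marks a new iteration.
     */
    static List<Call> compact(List<Call> trace, Loop loop) {
        List<Call> visible = new ArrayList<>();
        int previousLine = -1;          // -1 means "not currently inside the loop"
        boolean firstIterationDone = false;
        for (Call call : trace) {
            if (!loop.contains(call.line)) {
                visible.add(call);
                previousLine = -1;      // leaving the loop resets the iteration tracking
                firstIterationDone = false;
                continue;
            }
            if (previousLine != -1 && call.line <= previousLine) {
                firstIterationDone = true;   // execution jumped back up: next iteration
            }
            previousLine = call.line;
            if (!firstIterationDone) {
                visible.add(call);
            }
        }
        return visible;
    }

    /** A made-up trace: setup, three iterations of a two-call loop body, then report. */
    static List<Call> sampleTrace() {
        return Arrays.asList(
            new Call("setup", 3),
            new Call("read", 6), new Call("process", 7),
            new Call("read", 6), new Call("process", 7),
            new Call("read", 6), new Call("process", 7),
            new Call("report", 10));
    }

    public static void main(String[] args) {
        List<Call> trace = sampleTrace();
        List<Call> visible = compact(trace, new Loop(5, 8));
        System.out.println(trace.size() + " calls in the trace, " + visible.size() + " shown: " + visible);
    }
}
```

With the made-up trace, the eight calls shrink to four: setup, one read, one process, and report. A real tool would also need to handle nested loops and branches inside the loop body, which this sketch ignores.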

I ran some experiments on several traced programs, including Eclipse, and found that you can get upwards of an 85% reduction in the size of your visualization (depending on the program that you are tracing, and on the visualization--it's particularly good for sequence diagrams) just by compacting loops in this way. That's a pretty significant reduction. The Eclipse JDT was a huge help for this, by the way. It has awesome Java indexing, and a great Java parser that made my work a lot easier.

So, anyway, if you are in the vicinity of Madrid around March 15-18, 2010, come by CSMR and say hello.

Wednesday, December 2, 2009

Diver now available on Linux!

I've had a number of requests to get Diver running on Linux. So, I've spent the past few days trying to get it done. I think that I've had success, but it needs more testing.

I've tested it on a Linux machine here at the university. The above screenshot was taken by X-forwarding Eclipse to my local XP machine using Cygwin/X. The native code is compiled to 32-bit, so it should be compatible with pretty much any Linux machine. All the code is statically linked, so you don't need to have any libraries installed.

It hasn't been released to the "stable" stream (whatever that means in research ;-) ). You can install it from the "developer" stream using P2. I haven't done integration tests on Windows yet.

Right now, both the Windows version and the Linux version are held in the same plug-in fragment. I know it would be better to include only one of them for each platform. I'm a little confused about how to set up my features to do this, though. Does anyone have any tips?

Monday, November 30, 2009

MANIFEST.MF: Apparently, you are just supposed to know this

I came across a funny error today while I was working with a plug-in fragment in Eclipse. I had a fragment that contained several resources for use in a host plug-in. All of a sudden, my host plug-in was no longer able to find the resources from the fragment. I couldn't figure out why. Then I stumbled across some information about MANIFEST.MF files in Java jar collections. Apparently, there is a limit to the length that a line can be in a MANIFEST.MF file. If a line is more than 72 characters long, then it must be broken into more than one line. You can do so by breaking it on the 72nd character and continuing the next line with a space. My MANIFEST.MF file had one such line. So, changing the line from (imagine that there is no word wrap):

Eclipse-Variable-Example: Here is my really long manifest line that is more than 72 characters

to:

Eclipse-Variable-Example: Here is my really long manifest
 line that is more than 72 characters

made it work. I knew that you could break lines, but I didn't know that you had to break them. Now I know.

But this leaves me questioning: the standard says that you should break the line on the 72nd character. Does that mean that my example should have really looked like this:

Eclipse-Variable-Example: Here is my really long manifest line that is m
 ore than 72 characters

? That just seems like a weird way of doing it.
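One way to see what a writer actually does with a long line is to let the JDK write the manifest. The sketch below is only an experiment, not a statement about how Eclipse reads manifests: it pushes the made-up Eclipse-Variable-Example attribute through java.util.jar.Manifest, prints the raw result, and reads it back. The class and helper names (ManifestWrapSketch, write, read, longestLineBytes) are invented for this example. As far as I can tell from the JAR specification, the limit is 72 bytes of UTF-8 rather than 72 characters, and nothing requires the break to fall on a word boundary.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class ManifestWrapSketch {
    /** Writes a manifest with one custom main attribute and returns the raw text. */
    static String write(String name, String value) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // The main section is only written out when a version header is present.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue(name, value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            manifest.write(out);   // the JDK inserts the line breaks itself
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Parses manifest text and returns the (re-joined) value of a main attribute. */
    static String read(String text, String name) {
        try {
            Manifest manifest = new Manifest(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
            return manifest.getMainAttributes().getValue(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the length in UTF-8 bytes of the longest physical line in the text. */
    static int longestLineBytes(String text) {
        int longest = 0;
        for (String line : text.split("\r?\n")) {
            longest = Math.max(longest, line.getBytes(StandardCharsets.UTF_8).length);
        }
        return longest;
    }

    public static void main(String[] args) {
        String value = "Here is my really long manifest line that is more than 72 characters";
        String text = write("Eclipse-Variable-Example", value);
        System.out.print(text);
        System.out.println("Longest line in bytes: " + longestLineBytes(text));
        System.out.println("Round trip intact: " + value.equals(read(text, "Eclipse-Variable-Example")));
    }
}
```

Reading the text back joins the continuation lines again, so wherever the break lands, the attribute value that a parser sees should be the original string.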

Tuesday, November 24, 2009

A Use Case for Reverse Engineering

OK, so I'm going to try and motivate some of what I do with an example application. Here is the problem that I'm looking at. A few weeks ago, I had to put a little bit of code in an application to open up a web browser in Eclipse and display a web page. I had remembered doing exactly that many years ago, back during the transition from Eclipse 2.x to Eclipse 3.x when the browser became a standard component. I remembered it being really simple, but I couldn't remember for the life of me how to actually do it. I feel a little ashamed of that because it's something I should have known, but I didn't. So, I had to figure it out.

At this point, the entry might go long. So read if you are interested. Just quit otherwise.

Anyway, I started in the documentation. I did a search for "Browser". My problem is that I have so many plug-ins installed in Eclipse that help search is starting to get a little bit useless. I ended up with over 100 pages of documentation that had something to do with a browser, but I couldn't find what I was looking for. So, I decided to go to the Eclipse code.

So, I knew that if a browser was going to be opened, one had to be created. I was interested in using the internal browser, so I just went to the Browser widget, and set a breakpoint in the constructor to see who creates it. I started up a run-time workbench which had a project in the workspace containing an HTML file, and I opened it up. My breakpoint was hit, and I got this stack trace:

So, I started looking through the classes that were in there. What should I use? WebBrowserEditor? No, it's internal. WebBrowserEditorInput? Don't tell me that I have to construct an editor input, and pass it to the workbench just to get a web browser to open. No, I can't anyway since it is internal. I know that there is an easier way.

Anyway: long story short, I went down a rabbit trail of following inheritance hierarchies, and searching for references of different classes to see where they are created and accessed. I didn't have any real success, so I gave up on that train. I decided to give my Diver tool a try and see where it would take me. Note that this is a recreation of the event, and a number of the details are likely different.

So, to use Diver, I could just launch Eclipse in debug mode like I did before, but I did it using an Eclipse Trace Launch, supplied by Diver, as illustrated here:

There is a little more setup to do, though. By default, Diver is set up to watch for code that I've written (i.e. the files in my source folders). But I'm interested in Eclipse code, so I've got to change that. It can be changed in the Java Trace tab. I just guessed that the code that I was interested in was probably in a ui package, so I added org.eclipse.ui.* so that it would be included. Everything else will be ignored unless a class in an org.eclipse.ui package is accessed somewhere along the call chain:

Then, I just launched Eclipse like I normally would. I followed the same process as I did for my debug session, but instead of just stopping at the breakpoint, Diver logged all of the interactions that Eclipse made around the time that I opened the browser. I started the logging with a little "play" button found in the Debug View. Again, I opened the browser, and as soon as it started to show up, I paused the trace and quit Eclipse.

Now, it took a little while for Diver to analyze the data that it just captured. This is necessary to make Diver more zippy later on, so it's worth the wait. There were over 4 000 000 events that were logged. It didn't take too terribly long, though, and I did have other stuff to do.

Anyway, once that was done, I could see my trace in the Program Traces view:

This view shows all of the threads that were running while I was performing the trace. Double-clicking on one of them will open up a sequence diagram view that I could explore to figure out what the thread is doing. But, there were too many to try. So, I tried a different route. Notice the green dots? They mean that the trace associated with those threads is the "Active Trace" in Eclipse. It's kind of like Mylyn's tasks. If you activate a trace, then Diver will enable different features that will filter your package explorer to show you only the classes and methods that were used during the trace. I activated the trace that I just made, and went to the Package Explorer.

Again, I figured that what I was looking for probably happened around the same time that a Browser instance was created. The problem with using the debugger before was that I couldn't see everything that happened around that time: I could only see the things that occurred within the call chain to the construction of the instance. So, this time I used Diver to show me the context in which the browser was instantiated. I found it in the Package Explorer, right-clicked and selected "reveal in > main". What this did was it opened up a sequence diagram of the Main thread, and located the first method that was called on the Browser class:

Here lies another problem, though: the sequence diagram is really big. Here's a zoomed-out view of it:

I don't really want to look at it all. There are a couple of things that helped me out. First, there is a timeline on the sequence diagram:

The blue vertical bars are all the invocations of methods on the Browser class. The first one is the constructor call, so I adjusted the range on the timeline (the darker area around the first invocation) to just surround the constructor call a little bit. This filtered the sequence diagram to show only invocations that occur within that range of time. The other thing that helped me is the fact that the sequence diagram can collapse lifelines based on the packages that classes are contained in. For example, we can see here that the org.eclipse.swt.widgets package contains five classes that I'm not really interested in, so I can collapse it:

I collapsed a bunch of other uninteresting packages like java.* and sun.* and org.eclipse.core.*, etc., and got a much smaller diagram that I could start browsing. In a little while, I found another class that looked to be of interest: WorkbenchBrowserSupport. A method called createBrowser is called on it after the browser is actually instantiated. That's why I couldn't find it in the debugger:

Now, what I want to know is how to get a WorkbenchBrowserSupport, so I scanned up the lifeline, and I found this (cleaned up a little to make it look nicer):

At this point, I thought to myself, "Of course! You get the browser support by calling IWorkbench.getBrowserSupport()! I should have known that!" And I'm a little embarrassed to say that I didn't. But, my problem was solved at any rate. I just had to get an instance of the workbench (from PlatformUI), get the browser support, and create a browser. Done and done.

Now, this isn't to say that I solved a problem that couldn't be solved using other Eclipse tools. I'm sure that if I were willing to spend some time with the Java Search dialog, I could have found the reference to createBrowser eventually. But, I always get those search queries wrong and I come up empty. I'm just inept at using Java Search. So, here I had an alternative using some of the reverse engineering support of Diver.

I hope that this post motivated some of the neat things that can be done with a little bit of extra tool support. You can go ahead and try it if you like. It's all free.

That's it for now. I'll see you next time.

Monday, November 23, 2009

Hey, I've landed!

I just noticed that I've made it onto the Planet! I'm glad to be here, and I'm excited to meet you all! :-)

In case you've just joined me, this blog is going to be about reverse engineering in Eclipse. I am a researcher at the University of Victoria, figuring out how to make reverse engineering more accessible to the average coder. So, I'll be talking about some of my research experience too. I hope you enjoy it!

What is Reverse Engineering?

OK, so I've said that this blog is going to be about reverse engineering in Eclipse. That leads to the question: "What, exactly, is reverse engineering?"* The answer might be broader than a lot of people think.

A lot of people think that reverse engineering has to do with taking prebuilt software, running a decompilation process over it, and trying to copy the original software for personal gain. That can be some of reverse engineering, but it isn't the bulk of it. Academic literature on reverse engineering actually has very little to do with that.

The reality is that most developers do some kind of reverse engineering all the time. Whenever we think to ourselves, "I wonder how/why the program did that?" we are asking a reverse engineering question. We are trying to figure out from a previously engineered system the reasons that it was written the way it was, and why it does what it does.

I think that this has become a lot more commonplace with the boom of Open Source Software and great tools like Eclipse. Personally, if I want to find out what a piece of code like StringBuilder.append() does, I'm just as likely to simply press F3 in Eclipse and go read the code as I am to go off and read the API documentation. If I want to understand how to use class X in my software, I'm just as likely to use Ctrl-Shift-G and search for references to see how other people have done it as I am to search through mailing lists or, again, go to the documentation.

These are all microcosms of reverse engineering, and we developers do them all the time. But there are still some pain-points, and things that could be improved. One big thing is that we have gotten used to using debuggers to set breakpoints and step through problem code. But what do you do if you don't know where the problem code is, and so you can't set a breakpoint? That's where some more advanced techniques can come in handy. I'll dedicate my next post to one possible technique in Eclipse.

*Personal point of interest: many people would have said here "That begs the question...". I just thought that I'd add an "educational" note about that, because it is something that kind of gets to me when I read it. I did a minor degree in philosophy, and begging the question is actually a logical fallacy in which a person tries to prove a proposition by making reference to the original assumptions. There is a common example that shows up when people try to market products:

1. The best products are the ones that most people buy.
2. How can you be sure that they are the best?
3. Because the most people have bought them, of course!

Step 3 assumes the axiom is statement 1, and so does not actually answer the question in statement 2. Statement 2 was questioning the validity of 1, and 3 tried to answer 2 by resorting to the assumption (1) which was in question (2). Hence begging the question.

So, in reality when people say "That begs the question", they really mean "That leads to the question". I just wanted to educate the masses.

So here is what this blog is really about

Wow, I haven't really kept up with the whole blogging thing here. I just noticed that my last post was in March. I plan on trying to do better than that from now on.

Anyway, maybe I need to take this blog in a new direction, or in any direction really. So, here is my plan. I'm going to start blogging about my work with Eclipse more. I'm a master's student at the University of Victoria. I've been working for the past year on problems related to reverse engineering. The inspiration has really come from the fact that I worked for a while as a programmer, writing Java software. Everyone knows how much time is spent in debugging software, and just scanning over source code trying to figure out how things work. So, I got the opportunity in my research group here--the CHISEL lab--to do some work on figuring out how to make that sort of process easier.

You may have heard a little bit about the CHISEL group. It's the group that Ian Bull graduated from. We do quite a lot of work in Eclipse because it is such a nice free and open platform. Ian worked on the Zest project as part of his Ph.D. work. I was actually lucky enough to help him a little bit in the coding and design of that project--we are very cooperative in our group.

Anyway, long story short, I've been working on my own project in Eclipse called Diver. It stands for Dynamic Interactive Views for Reverse Engineering (I know, the last two letters are reversed--I was trying to be clever). It's a set of views that makes use of Eclipse's powerful Java Development Tools, and makes it easy to apply some reverse engineering techniques to day-to-day programming tasks. I think that Eclipse is the perfect environment to get this kind of work running in because it is built on such a good plug-in platform. It has been quite easy, in the whole scheme of things, to get my ideas integrated into the Eclipse platform, and to make use of the many powerful components that have already been implemented. For example, to do some of my Java reverse engineering, I didn't have to implement my own parsers and indexers into source code: I could just use the JDT. It's been a great way to get started.

So, here is where this blog is going to go: I'm going to start talking a bit more about my project and about how reverse engineering can be done in Eclipse in general, and about some of the challenges that I came across while trying to implement my solution. Hopefully it will be helpful to people out in the Eclipse community.

Monday, March 23, 2009

Linking boost libraries to DLLs in mingw

So, I've recently needed to use the boost libraries to create a DLL for use with the JVMTI. I know that it seems like a strange thing to have to do, but the long and the short of it is that I needed portable thread and socket libraries. So, I couldn't just depend on the header file linkage (like you would have with boost's smart pointers). I needed to link the compiled boost libraries to my DLL.

Now, here comes a problem: JVMTI won't automatically load dynamic libraries that are required by your agent. The obvious solution is to try to statically link boost to the agent DLL. I didn't think that this would be a problem. But, I'm using mingw for my C++ work (it has good support with Eclipse's CDT, which I use because this is JVMTI, and Eclipse is my Java environment). I spent a good day trying to figure it out.

Here is the trick: when you build your boost libraries using bjam, it defaults to naming the static libraries as ".lib" because you are using Windows. However, mingw is a port of gcc, which expects ".a" libraries. So, g++ won't link to the .lib's. The solution is to simply go into your build directory, and rename all of the .lib files to .a files. And voilà, you get linkage. It's crazy, but true. Hopefully this can save some other people some pain.

Thursday, February 5, 2009


This is the beginning of my blog experience. This blog is meant to offer updates and insights that I have relating to my research and work in the Department of Computer Science at the University of Victoria. Check out my research page here: