Monday, September 5, 2011

When Rigor becomes Rigor Mortis

I've worked in the field of computer science and software engineering research off-and-on for almost ten years: first, as a software developer and research associate and then, in the last 3 years, as a masters student. In that time, I've been able to work on a number of very interesting and innovative projects. That is what I think is a great advantage of academia. At least in theory, academics are meant to be free to investigate new ideas, new theories, and new technologies without the obligation to provide a product that will generate profit.

However, in the time that I've been involved in academia, I've learned enough to become a little critical of it as well. There is a well-known mantra that in academia one must, "publish or perish." If you don't run studies and publish papers, then you won’t make it as an academic. Since grants and fellowships are awarded based, in part, on the curriculum vitae of an academic, if the publications dry up, then the funding will as well. So there is a kind of “market value” that is put on research. It is based on what can be published, and it is adjudicated by conference chairs and journal editors.

I've reviewed my own fair share of papers. One thing that I and other researchers look for is rigor in research. Science needs rigor. It is important that a paper does not claim more than it can actually demonstrate. That means that the researchers have to work hard to design, implement, and run experiments. And they have to be careful about what they claim.

It’s my belief that modest claims in research are good claims. I think that in computer science and software engineering, we often research scenarios that are far too complex, with far too many variables, to support large claims. Unfortunately, it is difficult to publish modest claims. They don’t seem exciting enough. I see a lot of reviews that want papers to demonstrate that a new technique or technology is better than all of its predecessors, or that it helps all people in all situations. The average researcher just isn't up to the task. It is an impossible one.

It requires experiments that are too large and take too long. To make such strong claims, one needs to produce experiments that have statistical significance. When you are working with a new technology, that means that you need to pull from a large pool of unbiased, randomly selected participants that you can ask to try the technology, so that you can measure their usage of it, ask them about their reaction to it, and try to form some kind of generalizations and a theory about why the technology does or doesn’t work. The problem is that most researchers don’t have access to a large pool of participants, and the participants almost never can be randomly selected.

Even if these criteria can be met, the problem of gathering and analyzing data from these kinds of scenarios can become almost insurmountable. For my thesis, I, my supervisor, and several others designed and performed a study that involved ten participants in total. With only ten participants, it took us several months to transcribe and review the videos, analyze logged data from the application, and so on. If we had recruited as many participants as we needed to reach statistical significance, it would have taken us years to complete the research.

In such a fast-moving realm as computer science and software engineering, this kind of long turn-around time is very frustrating. It makes research seem to lag behind the innovations of commercial and Open Source projects. It also makes it difficult to do future research. Since the results lag behind the results of production software, it is difficult to convince professionals to participate in experiments. It is a large cost (in terms of time and money) for what they see as a small reward. This will make it more difficult in the future to perform academic experiments that are able to make strong claims.

It might be time for academics to re-evaluate the goals of computer science and software engineering research. What kinds of problems are they trying to solve, and what is really important? From my perspective, my own research has been a success. I started my masters not as an academic, but as a software developer. I had problems in my day-to-day programming that needed solving. My masters work gave me the opportunity to work with some excellent and smart people and come up with a workable solution. I was able to produce the Diver tool and hand it to the Eclipse community, which has always been a community that fosters innovation and excellent technical solutions.

The results of the work, however, have been difficult to publish. In spite of the recognition that Diver got by being named a finalist for "Best Developer Tool of the Year," the results of our research have yet to be published outside of my masters thesis. I get the distinct feeling that I can make a bigger impact by working toward building real solutions for real developers than I can by trying to convince conference chairs that the solutions are worthwhile.

So, I’m leaving the Diver project in the capable hands of the CHISEL group at the University of Victoria, and moving on to a new chapter in my life. I've been offered a job at Microsoft and I start tomorrow. It has been great working at the University of Victoria. There are some really smart, creative, and inspiring people there. It has also been great working within the Eclipse community. I truly believe that the work done with Eclipse has changed the way that people develop software. The community is great and so are the products.

In the end, I'm really just interested in designing and implementing good solutions to real-world problems. Microsoft has offered me what seems to be an exciting opportunity to do just that. As an added bonus, it pays better than the wage of a masters student :-). So, this will be my last post to Planet Eclipse. Thank you everyone for your support over the years and I wish you all the greatest blessings.

The Diver project is still available on Sourceforge. If you like, you can find my masters thesis here.

Monday, August 8, 2011

New Version of Diver Available


So, I just defended my Masters thesis today, and I figured that I would celebrate by releasing a new version of my thesis project, Dynamic Interactive Views for Reverse Engineering (Diver). It's a tool that allows you to dynamically trace the execution of your software, and focus in on particular features and bugs to help you analyze and understand it.

First of all, I'm going to apologize for not being very active around here since EclipseCon. I've been too swamped with finishing off my thesis. But, in the last little while, I've been able to work on Diver and make some updates. Unfortunately, since I have finished my Masters now, I won't be personally releasing new versions of Diver anymore. But, it is remaining with the CHISEL group at the University of Victoria, and it will hopefully pass to other capable hands. More about that, though, in a future post.

Anyway, there are a couple of enhancements to Diver, including the ability to trace your JUnit tests, and the ability to view sequence diagrams that are generated directly from source code without the need to create an execution trace. More information can be found on the Diver web page.

You can install Diver using the Eclipse Marketplace, or by following the instructions on the Diver web page.

Friday, March 25, 2011

Reflections on EclipseCon

Thanks to everyone at EclipseCon this past week for making it a great experience. JaxEnter asked me to write a short reflection about my experiences. I don't want to steal traffic from them, so you can see the article by following the link above :-).

The article is technology-centric. It was difficult for me to choose from among the many great things that are going on with Eclipse since it is such a vibrant and innovative community. The technologies in the article have heavy support from the top-level projects, so they will be interesting to follow. Also, in reality, they only reflect my own interests and experience and I could not possibly have seen everything going on at EclipseCon. That isn't to diminish any of the other great technologies that are built on Eclipse. There are a tonne of them that have become so integrated into my daily work that I barely even notice them anymore. That's a real compliment, I think.

The best thing, really, though, is the community. I have to admit that it was a little difficult to get to talk to people before I gave my talk on Wednesday. After that, though, I had a few great conversations. I'd like to give a shout-out to the guys over at Ericsson. For the past while, we have been working on the same set of problems, with support from the same funders. We had to go all the way to California for EclipseCon before we heard about each other, though. It's a funny world.

Oh well, time to head back home. Good-bye cold, rainy California. I'll be going back to the rain and palm trees of Victoria :-).

Monday, March 14, 2011

Tracing Web Apps Using Diver


Diver is a tool designed to help Java Developers understand how unfamiliar, or long-forgotten, software works. It lets developers easily trace the actions of Java Software and analyze it using an advanced Sequence Diagram View and some workbench filters that help focus on the software artifacts associated with particular features of the software.

With more and more applications moving to the Web, I've been asked a number of times, "Can Diver trace web apps?" I've always told people that it is possible to do, but I haven't given detailed instructions on how to do it. This post will hopefully remedy that.

First, a few notes. The version of Diver used to illustrate the points in this post is 0.3.1. It isn't an official release yet because there is still some testing to be done. I've had to do some tweaking to Diver, though, to make tracing web apps a little easier. So, you can get your copy of Diver 0.3.1 by pointing p2 to http://diver.svn.sourceforge.net/svnroot/diver/Development. Another thing: Diver only traces Java, not JavaScript. So, Diver can't be used to track down JavaScript errors that occur in the browser. It can, however, trace Java apps that have been deployed on a Java-enabled web server such as Jetty, Tomcat, or JBoss. That includes code invoked from JSPs. In this post, I'll be using JBoss 5.0.1.GA. There is no particular reason for that. I've used Diver to trace applications deployed in Jetty as well.

Install And Deploy Your Web App

The focus of this post is on tracing web apps, not on setting up web servers. So, I assume that you have your server ready to go. It's probably best, though, that you run on a small test server on your localhost rather than on a deployment machine. That just keeps things compact and easier to manage.

Diver works best if you have the source code for that app. It isn't strictly necessary, but it makes the tools that Diver offers a lot more powerful. At the very least, you have to be able to deploy your app into your own web server. You can just build and deploy your web app the way you normally would. For example, in JBoss, this would typically involve copying a WAR file into the jboss_root/server/server_name/deploy directory. When you are tracing your app, Diver will actually be tracing the copy that has been deployed on your server, not the version that is in your workspace. So, after you make any changes to your code, be sure to rebuild and redeploy before running your trace. Then, Diver will be able to match the calls to various methods/classes in the deployed WAR to those classes in your workspace, making it a lot easier to find things.
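The redeploy step is easy to forget, so it can help to script it. Here is a minimal sketch in Java, assuming the standard JBoss layout described above. The class name DeployWar and its parameters are made up for illustration; they are not part of Diver or JBoss.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper (not part of Diver or JBoss): copies a freshly built WAR
// into <jbossRoot>/server/<serverName>/deploy so the server picks up the new build.
public class DeployWar {

    public static Path deploy(Path war, Path jbossRoot, String serverName) throws IOException {
        Path deployDir = jbossRoot.resolve("server").resolve(serverName).resolve("deploy");
        if (!Files.isDirectory(deployDir)) {
            // Fail early rather than silently creating a directory the server never reads.
            throw new IOException("Deploy directory not found: " + deployDir);
        }
        Path target = deployDir.resolve(war.getFileName());
        // Overwrite the previously deployed copy, since that copy is what gets traced.
        return Files.copy(war, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

You would call DeployWar.deploy with the path to your built WAR, your JBoss root, and the server name (for example, default) after each rebuild and before starting a trace.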

Preparing for Launch

When tracing a Java Web App, what we are really doing is capturing specific events that happen inside a Java web server instance. So, the application we will be tracing is the Java Web Server. Ideally, Diver would be able to interact with the Eclipse Web Tools Platform so that you could just fire up a trace of a web server launch. Unfortunately, I haven't gotten that far yet. Fortunately, though, programs like JBoss are really just Java applications, so we can use the Java Application Trace functionality of Diver to grab a trace, but it will take a little set up first.

The Diver Java Application Trace is just an extension of the JDT Java Application Launch. The JDT requires that the applications it launches are visible in the workspace, so Diver requires the same. That doesn't mean, though, that you have to have all of the JBoss source to get started. You can launch directly from your installed JBoss instance. To get JBoss to run, you just need to make the right jar files visible. This is how you do that.

I suggest that you make the appropriate jars visible in their own project. You could, possibly, create dependencies to JBoss or Jetty, or what-have-you, in your web app's project, but that could cause build problems or runtime errors if your web app needs its jars to be isolated from the server. I just create a small Java Project called JBoss:


Once the project is created, you need to make the JBoss launch jar visible. To do that, just select the JBoss project in the Package Explorer or the Navigator and choose File>Properties <Alt+Enter> from the menu. Jump over to the Libraries tab of the Java Build Path settings page, and select Add External Jars:


The jar that you are looking for is run.jar. It's just the JBoss launcher. On my computer, it is found in C:\jboss-5.0.1.GA\bin:


That's all that you need for preparation. Now, we can get on to the launch.

Setting Up The Trace

Now that the JBoss launcher is visible to Eclipse, we can run it to gather a trace of our web app. Open up your Run or Debug Configurations dialog, and create a new Java Application Trace. In the Main tab, set the Project to the one we created earlier. The Main class that we will use is just org.jboss.Main. Simple enough:


Your web server will likely require a number of arguments to be passed both into the program and into the VM. Go to the Arguments tab. JBoss requires the following arguments:


Program Arguments:
-c <server> where <server> is the name of the configured server that you deployed your web app into. I typically just use the default one supplied by JBoss, so my program arguments are -c default.

VM Arguments:

-Xshare:off -Dprogram.name=run.bat -server 
-Dorg.jboss.resolver.warning=true 
-Dsun.rmi.dgc.client.gcInterval=3600000 
-Dsun.rmi.dgc.server.gcInterval=3600000 -Xms128m -Xmx512m  
-XX:MaxPermSize=256m

-Xshare:off is a Sun/Oracle specific VM argument that is there for Diver's sake. It tells the VM that this launch will not share its classes with anyone (the Java VM allows different processes to share the same loaded classes unless this flag is set). If you aren't using the Sun/Oracle VM, get rid of this argument. The other ones are there for JBoss. The really important ones (I think) are -Dprogram.name=run.bat (or run.sh if you are running on Linux) and -server. You can set the heap and Perm Gen sizes according to your liking.

Working Directory:

/jboss_home/bin. On my computer, this is just C:\jboss-5.0.1.GA\bin. This has to be set so that JBoss can know where it is running from and resolve all of its own resources.

If you are wondering how I discovered what all of these arguments should be, I did it by inspecting the run.bat script that is used to launch JBoss. The script just sets a bunch of default variables, and I used those. Another thing that you should be aware of is that you need to have your JAVA_HOME environment variable set to a valid JDK. A normal JRE might work, but (depending on the version of your server) it may not be able to compile JSPs unless you have the JDK. I'm sure you already knew that, but I'm just throwing it out there for good measure.
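To see what the launch configuration above amounts to, here is a rough sketch that assembles the equivalent command line in Java. It is only an illustration: Diver and the JDT build the real launch for you and Diver attaches its own tracing, which this sketch does not do. The class name JBossLaunchCommand and its parameters are made up for the example; the arguments are the ones listed above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the launch configuration described in this post.
// It only builds the argument list; it does not start the server or any tracing.
public class JBossLaunchCommand {

    public static List<String> build(String javaExe, String runJar, String serverName) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaExe);

        // VM arguments (drop -Xshare:off if you are not on the Sun/Oracle VM).
        cmd.add("-Xshare:off");
        cmd.add("-Dprogram.name=run.bat"); // run.sh on Linux
        cmd.add("-server");
        cmd.add("-Dorg.jboss.resolver.warning=true");
        cmd.add("-Dsun.rmi.dgc.client.gcInterval=3600000");
        cmd.add("-Dsun.rmi.dgc.server.gcInterval=3600000");
        cmd.add("-Xms128m");
        cmd.add("-Xmx512m");
        cmd.add("-XX:MaxPermSize=256m");

        // The JBoss launcher jar and its main class.
        cmd.add("-classpath");
        cmd.add(runJar);
        cmd.add("org.jboss.Main");

        // Program arguments: which configured server to start.
        cmd.add("-c");
        cmd.add(serverName);
        return cmd;
    }
}
```

If you wanted to run this outside of Eclipse, you could hand the list to a java.lang.ProcessBuilder and set its working directory to the JBoss bin directory, which mirrors the Working Directory setting above.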


Tweaking The Trace

That is all you technically need to do to get started. Now, you can just select the Run button and start gathering information like you normally would with Diver (see the Diver documentation). But there are a few things that you can do to tweak the trace and make sure it is efficient. Web apps can be slow as it is. Tracing them introduces I/O overhead that we would like to minimize. Go over to the Java Trace tab of your configuration, and set it up to look something like this:


First of all, you are likely only interested in the things that your web app is doing, so I suggest selecting the Set Filter Manually option, and adding only the packages that are a part of your web app to the Inclusion Filters area. By default, Diver will analyze the trace and only watch for methods defined in the project that is named in the Main tab of the launch configuration. In this launch, there is nothing but the JBoss launching jar, so the analysis would end up being mostly useless unless you set the filters manually.
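If it helps to picture what an inclusion filter does, here is a small sketch of package-prefix matching in Java. This is not Diver's actual implementation, and Diver's filter syntax may differ; the class InclusionFilter and the com.example.shop package are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of an inclusion filter: a class is kept only if it lives in
// one of the listed packages (or a subpackage). Not taken from Diver's source.
public class InclusionFilter {

    private final List<String> packages;

    public InclusionFilter(String... packages) {
        this.packages = Arrays.asList(packages);
    }

    public boolean accepts(String qualifiedClassName) {
        for (String pkg : packages) {
            // Require the trailing dot so "com.example.shop" does not match "com.example.shopping".
            if (qualifiedClassName.startsWith(pkg + ".")) {
                return true;
            }
        }
        return false;
    }
}
```

With a filter built from com.example.shop, a call into com.example.shop.Cart would be kept, while the many calls inside org.jboss classes would be dropped.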

I also suggest making sure Pause On Start and Apply Filters at Runtime are checked. Pause On Start is the default, and it just means that none of the start-up process of the program will be traced. You can define what actually gets recorded in real-time using the Eclipse Debug View as described in the Diver documentation.

Older versions of Diver (0.2 and earlier) recorded absolutely every method call, and "rejected" data based on the defined filters during a separate analysis step. Since version 0.3, Diver allows you to reject that data before it gets stored, which can speed up your traces and your analysis many times over. Make sure Apply Filters at Runtime is checked to enable this speed-up.

Get Going

There you go, that is all you need to get a trace of a web app using Diver. Now, you can record and compare traces to focus in on specific bugs or features in your app and use Diver's sequence diagram to analyze what is going on. The Diver documentation and the video tutorials tell you how to do all that fun stuff:


There is just one more small bit of housekeeping that you might want to be aware of. Launching a web server in this way doesn't give you a clean, interactive way to shut it down. You can always just use the "Stop" button in the Debug View to force it to quit, but that could leave your server in an undefined state. I suggest opening up a console, and using your web server's shutdown command. For JBoss, all you have to do is type shutdown -S.

That's all for now. If you are out at EclipseCon, make sure to come to my talk where I'll be presenting Diver and how it helps users of Eclipse understand their software. Happy coding!

Tuesday, March 1, 2011

Diver is a finalist for an Eclipse Community Award


I've been swamped with work lately, so I'm afraid that I haven't been able to devote much time to blogging. But I got news today that my Diver tool has been selected as a finalist for the Best Developer Tool category alongside the excellent PyDev environment for the Python language. Diver has really been my baby for the past couple of years. I've been working hard on trying to solve the problems associated with understanding software. I hope that Diver can help developers in that respect (I know it helps me :-) ). Even so, it is a great honour to be selected as a finalist from among more than 40 other excellent solutions. Eclipse has such a vibrant developer community. There is a solution out there for almost every problem.

If you are going to EclipseCon, be sure to check out my talk to find out more about Diver and how to incorporate reverse engineering into your everyday work to improve your understanding of software.

Tuesday, December 21, 2010

See you at EclipseCon!

I felt like I won an award this morning when I found this in my email:


Dear Del Myers,
        
We are pleased to accept your proposal "Put It In Reverse: Using Eclipse to
Understand Code that has Already Been Written" for EclipseCon 2011. 

The Program Committee received many excellent submissions, and we're 
very happy to include "Put It In Reverse: Using Eclipse to Understand Code
that has Already Been Written" in that select set.

Thank you for helping to make EclipseCon 2011 the best one yet!

I'm certain that EclipseCon will be excellent, and I hope to do my part in making it that way. So, everyone come and check out my talk if you have the time: https://www.eclipsecon.org/submissions/2011/view_talk.php?id=2099

Tuesday, December 7, 2010

Interview about Diver


The online magazine JaxEnter.COM recently interviewed me about my tool Dynamic Interactive Views for Reverse Engineering (Diver), which is an open source tool for bringing reverse engineering processes into the hands of us ordinary developers. Wondering what this tool has to offer? Check out the interview: http://jaxenter.com/diver-interview-32838.html