Applicability of the .NET Platform to Bioinformatics Research
Use .NET to heal

A current look at bioinformatics reveals a field largely dominated by the Linux operating system and by programming languages such as Perl, Python, and Java. Windows and its associated native application development platforms are not in widespread use among present-day bioinformatics practitioners. In fact, Linux and other open source technologies will likely remain the dominant platforms upon which most novel and/or large-scale bioinformatics research is conducted. Scientific computing of all types has deep-seated roots in Unix and its derivatives, and as a result depends heavily on code bases written with *nix platforms in mind. Many scientific applications are written for High Performance Computing (HPC) architectures or distributed computing environments, and such applications often need to run for lengthy periods of time, which makes OS stability an important factor. While Windows operating systems have made inroads in the server market, Microsoft-based products remain largely absent from the HPC market. Practical issues aside, most bioinformatics practitioners also strongly favor open source ideologies and technologies, since the free exchange of ideas is valued as one of the fundamental building blocks of scientific progress.

That said, in the not-so-distant future there will be an increasing demand for Windows-based bioinformatics applications. Bioinformatics has grown rapidly over the last decade and has in many ways revolutionized the way certain aspects of the biological sciences are conducted. Bioinformatics methodologies were of critical importance to completing projects such as the genome assembly portion of the Human Genome Project. Advances in genome sequencing, proteomics, and microarrays, as well as in other forms of biological data collection, are generating voluminous amounts of data and thus rapidly changing biology from a purely experimental science into an information science. As this change occurs, many bioinformatics tools are adopted by mainstream biologists and leave the realm of specialized knowledge that requires the skill set of a trained bioinformaticist. Perhaps the most prominent example is the widespread adoption of the BLAST algorithm, which aligns DNA or protein sequences based on the similarities of their composition. When the algorithm first appeared it was mainly a tool for individuals interested in the computational analysis of biological sequences, yet it is now a staple technique in many research projects that would not be considered computational in nature at all. In fact, today BLAST is among the most widely used of all bioinformatics applications, and the major interface for using it is the Web application hosted by the National Center for Biotechnology Information at www.ncbi.nlm.nih.gov/blast/index.shtml.

The popularity of the BLAST Web interface illustrates some important points about mainstream application acceptance among biologists. The Web interface presents users with labeled text boxes and drop-down menus, which simplifies the interaction between the user and the application. In contrast, there is also a stand-alone BLAST client with a command line interface (CLI) that can be downloaded and run against local sequence databases, or that can interact with a Web API via a Perl script or equivalent. Professional bioinformaticists who are comfortable with CLI input or scripting often prefer these alternatives to the Web application because of the greater degree of customization that is possible or the ability to automate large jobs. Still, these are skill sets that the average experimental biologist does not possess. Proficiency with the Windows OS and Windows-based applications, however, is commonplace among biologists, and like the Web interface, Windows applications provide a graphical means for user interaction.

As time passes and biologists continue to amass large data sets, their desire to conduct bioinformatics-type analyses on those data sets will also grow. Many algorithms and techniques will go the way of BLAST and leave the realm of specialized knowledge, thereby achieving mainstream usage. Unlike BLAST, however, not all of these methods will be best suited to development as a Web application; some would be better offered as a desktop application. This is where there will be a newfound need for bioinformatics application developers to shift development efforts to a Windows platform, so that biologists can use the applications in an environment and layout that is familiar to them. The need for simplified access to bioinformatics applications has also been recognized by Apple, which advertises that many bioinformatics tools can be run on OS X because of its Unix core, and that the OS X desktop can make the experience more user-friendly. While perhaps not the best platform for developing the most computationally intensive applications, the Windows environment has demonstrated itself to be suitable for many types of bioinformatics analyses. One of the most interesting examples of this is the recent demonstration by Microsoft Research that code found in the MS AntiSpyware application could be used to find genetic patterns in HIV. Getting more of these types of tools into the hands of biologists would greatly accelerate the pace at which many types of research findings could be made and would enhance the ability of biologists to tackle pressing biological problems such as disease, drug resistance, and bioterrorism, to name a few.

The question is how to facilitate the development of such applications. I believe the answer lies in looking at how the present-day bioinformatics community operates, and in transferring some of that ideology to a Windows development platform. The widespread acceptance of open source methodologies within bioinformatics is often credited as a factor that contributes to the rate at which bioinformatics researchers are able to produce new tools and analysis techniques; a good example of this is the BioPerl library of modules. The BioPerl project allows bioinformaticists to contribute code to the project in the form of a Perl module, and other bioinformaticists can then download the code and freely use it within their own applications. This keeps researchers from reinventing the wheel and lets them focus on the novel scientific aspects of their project rather than on coding routine tasks, a concept that is not so different from the classes that make up the .NET Framework.

This suggests that the development of an increasing number of bioinformatics applications for Windows could be greatly facilitated if a .NET class library consisting of specialized bioinformatics classes were developed. For example, such a library might contain functionality that computes the properties of protein or DNA sequences, as in the example code provided in Listing 1, which calculates the molecular weight of a protein based on its amino acid sequence (see Figure 1). Standardized libraries are especially important in science, where reproducibility is of key importance, and having applications based on a common set of underlying functionality is one way of ensuring this. This goal may even be furthered by creating a class library that could be used interoperably with the Mono project, since this would allow the functionality to be reproduced in a more platform-independent manner. Moreover, it is important for the library to be developed in an open source manner because modifications and contributions by the scientific community will be imperative. The needs of scientists are constantly changing and the field of bioinformatics is quite diverse. It would be difficult for a single development team to develop a library with functionality broad enough to attract a cross section of all bioinformatics researchers. A community-based development approach would help to ensure that the library has the requisite diversity and that as the field advances, so too do the classes that compose the library.
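As a rough illustration of the kind of routine such a library might expose, the following is a minimal sketch of a molecular weight calculation. It is written in Python for brevity and is not the .NET listing referred to above. The function name `molecular_weight` and the table name `RESIDUE_MASS` are hypothetical. The sketch assumes the sequence uses only the 20 standard one-letter amino acid codes and that average (not monoisotopic) residue masses are wanted. The weight is taken as the sum of the residue masses plus one water molecule for the free chain termini.

```python
# Average residue masses in daltons (free amino acid minus one water).
# Assumption: only the 20 standard amino acids are supported.
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "E": 129.1155, "Q": 128.1307, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}

# One water is added back for the N-terminal H and C-terminal OH.
WATER_MASS = 18.01528


def molecular_weight(sequence: str) -> float:
    """Return the average molecular weight (Da) of a protein sequence.

    Hypothetical helper: raises ValueError for an empty sequence or
    for any character that is not a standard one-letter residue code.
    """
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


if __name__ == "__main__":
    # A single glycine comes out at about 75.07 Da.
    print(round(molecular_weight("G"), 2))
```

Keeping the mass table in one shared place is the point of the example: every application built on the same library would report the same weight for the same sequence, which is the reproducibility argument made above.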

Summary
In all, bioinformatics is a field of research that has undergone a rapid explosion in the number of tools and techniques it has produced. While much of this progress can be attributed to the adaptability and flexibility of open source technologies such as Linux and Perl, it is important for bioinformatics professionals to consider that disseminating their tools to users can be as critical to scientific progress as developing new ones. A key way to facilitate the widespread dissemination of bioinformatics applications to biologists will likely be the development of an open source .NET class library to serve as a framework for Windows-based bioinformatics applications.

About Christopher Frenz
Christopher Frenz is the author of "Visual Basic and Visual Basic .NET for Scientists and Engineers" (Apress) and "Pro Perl Parsing" (Apress). He is a faculty member in the Department of Computer Engineering at the New York City College of Technology (CUNY), where he performs computational biology and machine learning research.
