Wednesday, February 28, 2007


The subfield "NMR of proteins" is so visible that it makes the rest of the NMR field almost invisible. I suspect that at least some of the younger researchers in the subfield do not even know that NMR can be useful for more than studying biopolymers. Until a decade ago these guys were almost the only ones to write software for NMR processing. I have already dedicated two posts to the survivors, NMRPipe and Felix. Afterwards they realized that what they really needed was, instead, a tool to manage their lists of cross-peaks or, as they say, to analyze their spectra. Many laboratories have written their own programs which are, quite likely, a constant presence on their monitors, just like an IDE for a programmer. A generous list of these products can be found on the Sparky web site, at the University of California, San Francisco. Unfortunately a good number of the links are broken (call it Darwinian selection; NMR-software paleontology has already become a rich subject).
The good thing about Sparky is that really anybody (with an internet connection) can install it. Registration is required, but the program is not copy-protected and access to the download area is not restricted. Registration itself is free and easy. And if you work in industry it makes no difference (what a breath of fresh air!). Installing Sparky on my iMac was much easier than I had expected. The program requires X11, and running it on the Mac entails a few annoying details that I will not go into here. (Yes, the user experience suffers.)
The manual is only a few pages long, is written in simple English and requires no specialist knowledge from the reader. Here and there you encounter a helpful piece of advice, like: "In the red-blue scheme the lowest contour level is red and the highest level is blue with intermediate levels having intermediate colors. This can help you see peaks when looking at highly overlapped regions that are a mass of contours, or when you wish to have low contour levels that show a lot of noise". By default all positive levels are red and all negative levels are green. Simple means Great!
You can start using the program from day 1, but it will take days to become proficient with it. Sparky comes with a number of Python extensions and more can be added, of course. Mastering all of them is probably useless, but mastering the accelerators (keyboard shortcuts) is extremely useful.
It's apparent that Sparky embodies years of working experience and there's no need for my rating here (I have no protein to analyze, anyway). Most of the program is about book-keeping the user's lists. There are only a few computation routines, all confined to the task/puzzle of integrating the volume of the peaks.
The manual says it's better to fit the peaks (with a Gaussian or a Lorentzian shape) than to sum the discrete points. After the fit, you get a numerical value for the volume and an rms error. I experimented with a nice ROESY, with a good S/N ratio, and got giant rms values (easily above 100%). Unfortunately I don't understand why: in theory I can simulate the calculated shape, but with a different program. Being ignorant myself, I feel safer integrating in the traditional way. The big problem is that traditional integrals are underestimated (the tails, which fall below the lowest contour, are skipped, and with them you give up 10-20% of the area). If all the integrals are similarly underestimated it's OK, but if I mix them with fitted areas (hopefully unbiased, but certainly not with the same bias), what am I going to achieve? I stop here, but the subject deserves a book.
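To get a feeling for how much the skipped tails can cost, here is a rough back-of-the-envelope sketch in Python. It is not what Sparky does: it assumes an ideal, isolated 1D line on a flat baseline, and the function names and the threshold parameter are my own inventions for illustration. It computes the fraction of the total area that lies between the two points where the line crosses a given fraction of its height (the "lowest contour"). A real 2D volume will behave differently, so take the numbers as an order of magnitude only.

```python
import math


def lorentzian_fraction(threshold):
    """Fraction of an ideal 1D Lorentzian's area found between the two points
    where the line crosses `threshold` (expressed as a fraction of peak height).

    Assumption: L(x) = 1 / (1 + (x/w)^2), with w the half width at half height.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be between 0 and 1 (exclusive)")
    cutoff = math.sqrt(1.0 / threshold - 1.0)  # crossing point, in units of w
    return 2.0 / math.pi * math.atan(cutoff)


def gaussian_fraction(threshold):
    """Same quantity for an ideal 1D Gaussian line."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be between 0 and 1 (exclusive)")
    return math.erf(math.sqrt(math.log(1.0 / threshold)))


# Hypothetical "lowest contour" levels, as fractions of the peak height.
for level in (0.10, 0.05, 0.02):
    print(f"lowest contour at {level:.0%} of height: "
          f"Lorentzian keeps {lorentzian_fraction(level):.1%}, "
          f"Gaussian keeps {gaussian_fraction(level):.1%}")
```

Under these assumptions the Lorentzian, with its long tails, loses far more than the Gaussian at the same contour level, so the bias of the traditional integral depends on the line shape as well as on the threshold.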
There are also facilities to navigate through 3D spectra and to compare different spectra. If your need is to display a spectrum, however, you are better served by iNMR. In its latest incarnation (2.1) the latter sports 5 plotting modes: intensity, contour, stacked, arrayed, chessboard, while Sparky has 1 mode only (and does not handle 1D spectra...). iNMR has trivially intuitive 1-key triggers (and, if you keep the key down, the action is repeated), while Sparky has 2-key "accelerators".
Sparky has been written to perform a specific task. When a program has a single task to do, it can do it better.
"Also output formatted for structure determination with DYANA, or distance restraint calculation with MARDIGRAS can be generated. Finding peaks and making assignments is done manually through a graphical user interface. Sparky does not do spectral processing or distance geometry, molecular dynamics, or make toast." (sic)

Monday, February 26, 2007


Today there is a new site to visit:
The newest site corresponds to the oldest living software because, as far as I know, all the older alternatives have either changed their names or ceased to exist. The site is elegant and readable, even if it tends to be irritating: almost every page begins with the sentence "Felix is the industry standard software program"... First of all I don't understand what the expression "industry standard" really means, if it means anything at all. Second: this application is focused on NMR of proteins which is, in my very personal opinion, a purely academic exercise. I know that pharmaceutical companies also have their feet in the same field, but this happens because even pharmaceutical companies like to make some academic type of effort. To the best of my knowledge this kind of NMR has not fulfilled its promises.
Academic groups have developed their own equivalent software, which is usually free, so the room for a program like Felix is narrow, as its troubled history demonstrates. Over the years (almost 20), it has belonged to Hare Research, to Biosym, to MSI, to Accelrys and, from 2007, to Steve Unger. The mere fact that it still exists is a great merit. If NMR remains in fashion the story can go on for many years. From the number of brands, and from the price, I gather that in this kind of company there are many more generals than soldiers.
I personally used Felix in the summer of 1991, because it was free for academic institutions (at least for the one I was at). I hardly remember what I ate yesterday, and 16 years is too long for my memory. I simply remember that, at the time, I liked Felix. Even if I remembered more it would be irrelevant, because the present product is certainly different. Reading the on-line documentation, there is nothing impressive on show (it lists: "States-TPPI spectra", "Oversampled Bruker digital data", "Solvent suppression"... and even "Fourier Transforms" and "Interactive phase correction", all things that are ubiquitous today and were even yesterday). The complementary modules are more promising: "The sophisticated algorithms [...] make FELIX Assign invaluable during the complex task of spin system assignment. FELIX Assign can literally save years of effort". This sentence makes me wonder why Accelrys let Felix go.
When the web site is complete, it will apparently be possible to download a trial version. For now you need to send an e-mail.

Tuesday, February 20, 2007


Only today did I discover a mature piece of software called MRUI. It's free (though it requires a less-than-simple registration), written in Java, and either you already know it or you don't need it.
MRUI is about in-vivo spectroscopy and, according to the web site, "more than 780 research groups worldwide in 53 countries benefit from" it.

Thursday, February 15, 2007

It's (not) a dirty job, but somebody's gotta do it

You may wonder why a former researcher spends his time deciding whether the integral labels should be written horizontally or vertically. It happens, you know, that it's better to spend an evening making something stupid like this (but that will certainly be used) than to spend 20 years writing hundreds of scientific articles with no practical consequence for the health of mankind. Obviously, risk is part of the game and both activities are welcome.
A couple of posts ago we witnessed the superiority of vertical labels when the spectrum is very crowded. Horizontal labels have their own merit, too: they are more readable. Last night I worked a little to improve them. Here was the starting point:

This solution is not acceptable. A first remedy is to displace the labels, using the same algorithm implemented by peak-picking and by the vertical labels. I have found a program on the net that does it, in this way:

I like the idea, but the implementation could be improved. I can't tell where one label ends and the next begins. It's also hard to relate the green intervals to the corresponding peaks: my eyes soon get tired of this exercise. I still prefer to split the integrals into two rows, because the brackets stand out better. My improvement consists in displacing the labels. They can automatically displace themselves with a trivial algorithm, as depicted here below:

While the algorithm in the middle picture is sophisticated and slow (every region is aware of every other region), in the example at the bottom every region ignores all other regions. I start drawing integrals from the right (in increasing ppm order). When drawing a new integral, I know whether there is already a label on its right, but not whether there will be something on its left. Nothing prevents me from adopting the more sophisticated (global) approach. My priority is different, however: I want the displaced label to keep a minimal overlap with its bracket. In this way I can do without the oblique handles. I can still combine the global algorithm with the restrained displacement, but I am already happy with the small advantage that I have gained so far.
The picture shows it effectively: if the label "4.069" (the first to be drawn) knew what was to follow, it would have displaced itself more to the right and there would have been more white space between "27.64" and "21.17". This can only be done with the slower global algorithm.
In conclusion: most people have survived well, until today, with centered labels (top picture). Now their work will be easier with auto-displacing labels (bottom picture), even if the implementation I have chosen is sub-optimal. It hasn't been such a boring job so far, after all.
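For the curious, the local (right-to-left) placement described above can be sketched in a few lines of Python. This is a minimal illustration of the idea and not the code that actually runs in the program: the `Region` class, the `place_labels` function and the `gap` and `min_overlap` parameters are all hypothetical names, and I assume screen coordinates that grow to the right (so ppm grows to the left).

```python
from dataclasses import dataclass


@dataclass
class Region:
    left: float         # screen x of the bracket's left end
    right: float        # screen x of the bracket's right end
    label_width: float  # width of the printed integral value


def place_labels(regions, gap=2.0, min_overlap=4.0):
    """Return the x of each label's centre, in the same order as `regions`.

    Greedy and local: labels are placed from right to left, and each one only
    knows about the label already drawn on its right.
    """
    centres = [0.0] * len(regions)
    # Rightmost bracket first, i.e. increasing ppm order.
    order = sorted(range(len(regions)),
                   key=lambda i: regions[i].right, reverse=True)
    limit = float("inf")  # left edge of the label drawn just before
    for i in order:
        r = regions[i]
        half = r.label_width / 2.0
        centre = (r.left + r.right) / 2.0
        # Shift left just enough to clear the label on the right...
        centre = min(centre, limit - gap - half)
        # ...but never so far that the label leaves its own bracket.
        overlap = min(min_overlap, r.right - r.left, r.label_width)
        centre = max(centre, r.left + overlap - half)
        centres[i] = centre
        limit = centre - half
    return centres


# Three crowded regions, each with a label wider than its bracket.
crowded = [Region(100, 110, 16), Region(88, 98, 16), Region(76, 86, 16)]
print(place_labels(crowded))
```

In this made-up example the first label stays centred, the second slides to the left, and the third is stopped by the minimal-overlap rule before it has fully cleared its neighbour. That is the price of the restrained displacement; a global algorithm could have moved the first label to the right instead.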

Friday, February 09, 2007

Shimming doesn't hurt, but...

My life has taken a strange turn in the last two months, and it seems that I have forgotten the blog. I wrote at the very start that it's like a book, which means that it has a beginning and an end. I feel like I have already said the really important things, but there's still a lot to say. At the moment something else has taken priority in my life, but I am confident I'll find the time, sooner or later, to write everything down.
Today I am recording the rare event of somebody posting a comment. Since it is on a hidden page, and is interesting enough, I am reproducing it here:

I would say that the processor tool is impressively good for importing raw NMR data, and then having them processed by proper phasing, baseline correction (you can have the solvent peak removed from the baseline correction, such as a big ugly water peak), and more importantly, reference deconvolution, to get a beautiful spectrum to start with. Also the reference deconvolution function is a good tool to investigate coupling constants, when they are pretty small (such as 0.5 Hz for long-range coupling constants). With the RD function, you can find that lots of claimed singlets, such as the main peaks of creatinine, are not. The aromatic peaks of histidine are not either. The RD function is a necessity to have, while no other commercial NMR software has it until now. It really eases the difficulty of shimming. Chenomx should have a better online demo to show how to use it.

written by: Ed

I'll certainly come back soon to Reference Deconvolution and to Chenomx, as well as to all the other makers who are kind enough to let me evaluate their work. If you don't have the patience to wait another couple of months... post your review!

Thursday, February 01, 2007

Horizontal vs. Vertical Integrals

This subject was already treated in an old post. That time the horizontal integrals were the winners. Today I witnessed the revenge. I also heard for the first time that there are still ogres in America (I already knew about Shrek, of course, yet I was convinced it was an isolated case).

V. writes to G.

Also, is there some way to get the integrations to be displayed vertically instead of horizontally?

At this point G. copies and pastes the content of the blog:

The integrals are written horizontally. You also wrote your e-mail horizontally. Why do you want to write vertically? It's more difficult to read. Your eyes will soon get tired. There are many tricks to avoid overcrowding of integrals:
reduce the number of digits.
set 1 proton = 100 integral units and you can do without decimal digits.
enlarge some integrals to the right or to the left: the number will be displaced accordingly.
change the font.
use the cutter tool to remove non-interesting regions.
print on larger sheets of paper (we in Europe have the A3 format).
you can merge two integral regions; you will only read the sum, yet it's usually trivial to estimate by eye the contributions of each peak.

and V. demolishes them one by one:

From the tone of your response to the vertical integration request, it seems to me that you have received this request on more than one occasion. Let me try to delineate why... Some of the seven things that you suggest are just not feasible for an organic chemistry grad student, post-doc (such as myself), or faculty member. The size of the paper is dictated by that which the advisor or journal chooses and cannot be modified. Furthermore, some advisors and many referees want to see the exact scale for every spectrum. If you cut part of it out, they will wonder what it is that you are hiding and may reject the submission if it is for a well respected journal. Most organic chemists want to see separate integrations for every identifiable proton signal; anything less, and again people wonder what you are hiding... We want to be certain that every single proton integrates accurately within acceptable margins of error. The Ogre at the graduate school in some instances will reject the document if it is not written in a single font style. As for the whole number integrations... we as organic chemists have a convention. It is a widely accepted one, and one that we are unlikely to change. We want to see the integrations represent the exact number of protons with two decimal places, period. The vertical integrations keep the values neatly stacked side by side, like the Rockettes. If you have a spectrum with closely spaced signals, the horizontal integrations take up more space, and when they overlap they become difficult to read.

The last defense of G. was:

I object to one main thing: cuts are good. You can't say that cuts hide the spectrum. Let's say that you print all your 1-H spectra from 0 to 10 ppm. I can argue that you are hiding some peaks at 11 ppm from me! If what you write is valid, the consequence would be that all spectra should be printed from -5 to 15 ppm! And, even in this case, there is a program, ACD NMR-processor, that lets you easily clean the spectrum of any peak you like...
Nothing prevents you, however, from printing a panoramic copy of the spectrum (without integrals) and a fragmented copy with integrals. Also consider the creation of insets.
You are probably the third person to ask me for vertical integrals, and certainly the first one I could not convince to forget them. I had my answer prepared mainly because of my long-term experience with NMR (16 years as a spectroscopist in the pharmaceutical industry). I could always print my spectra on an A4 sheet, with horizontal integrals.
Let me clarify one more thing: when I wrote about hiding the decimal digits, I meant: first multiply by 100, then truncate.
For example: 1.00 takes more space than 100, but the precision is the same.

V. strikes with a new weapon: the picture

For terpenoids, most impurities will not occur outside of the 0-10 range, they occur within it. For a thesis or supporting material, there should never be any cuts within the 0-10 area where almost all signals are going to occur. Although it does not go to 10 like it should, in my example spectrum you will see two labeled impurity signals. If I cut that portion out, I could claim that my material was pure. The other portion of the spectrum that is labeled contains three closely spaced signals that, if written horizontally, would most certainly overlap. They are clearly three separate signals and should be integrated as such. The use of 1.00 vs 100 is again, a convention. Advisors and referees are creatures of convention and habit. They don't want to see 100, they want to see 1.00. Sure the precision is the same, but trying to get your peers to accept it would take some force. I am not familiar with the ACD NMR-processor and its capabilities, and so I cannot comment on it.
Oh, and the strangely large margin requirements displayed on the attached spectrum are part of the rules to publish a thesis here in the states... Again, this is not something that a grad student has control over, the Ogre at the grad school would reject the thesis if any little portion of your spectrum went into the margin area. The truncation at 8 ppm was a decision based on the small area given to display the spectrum, and the fact that no impurity for this compound should occur downfield of 8 ppm.

G. admits defeat:

Your picture is convincing proof enough. Luckily I have no experience with ogres and could always avoid bureaucratic issues. Yours seems to be a nice 500 MHz spectrum, and it is certainly a great experience to see it on the 20" monitor of an iMac, at full screen and possibly with cuts. Don't they even allow insets? These requirements seem so strict that, by this point, they could even specify which software to use.
My conclusion is that vertical integrals certainly allow for more crowding (almost unlimited), but horizontal integrals are more readable. They enter my eyes and brain without effort. If it were not for bureaucracy, horizontal integrals would be better.

V., however, is in no mood to celebrate his victory:

Now I hope that you have a little better understanding for the plight of the piddley org. chem. grad student. Everyone has all these requirements, but no one wants to make it easy to meet them. Some people still take their spectra, tape on the label and scan them... imagine how long that takes when you are dealing with 300 pages of spectra...